ABSTRACT


This thesis reports the implementation of a Data Base
Management System (DBMS) based on the CODASYL design. The
DBMS was implemented on a DEC PDP 11/50 computer utilizing
the UNIX operating system. Background material includes a
discussion of data base history and techniques, and the
design of UNIX and the C programming language. The research
performed was the adaptation of the CODASYL DBMS design to
the UNIX environment and the design of a C language Data
Description Language (DDL) and Data Manipulation Language
(DML) to interface the DBMS to user programs. Conclusions
and recommendations for improvements are also included.


I. INTRODUCTION.


The Conference on Data Systems Languages (CODASYL) has
defined a data base system [Ref. 2 and 3] which is partially
generalized and partially tailored to COBOL. This system is
based on network data modeling techniques. CODASYL has made
the claim that the system could have other languages
effectively interfaced to it and that the system could be
implemented in a variety of environments. The Computer
Science Department has two Digital Equipment Corporation
PDP 11/50 computers running the UNIX operating system
(Ref. 1). The equipment was acquired for research in signal
processing applications. This environment is one in which a
CODASYL based data base management system has never been
introduced.


Relational data modeling techniques are the major
competitor with the CODASYL system. Recently, development
was completed on a relational data base management system
which runs under UNIX (Ref. 4 and 5). This system is called
INGRES and was developed at the University of California at
Berkeley. Currently a discussion is taking place in the
literature over the relative merits and drawbacks of
relational models versus network models (chiefly the CODASYL
version) (Refs. 6, 7, 8 and 9). Although much has been
written about the merits of each model, relatively little
empirical comparison has been done. Therefore, since steps
are being taken to acquire the INGRES software, it was
decided that a CODASYL data management system would provide
a complete suite of data management software.


The tasks to be accomplished were design and
implementation of a UNIX hosted CODASYL system, design and
implementation of a C language [Ref. 10] interface to this
system, acquisition of INGRES and comparative studies of the
two systems for signal processing applications. This thesis
documents the design and implementation of a UNIX hosted
CODASYL data base management system and the design of a C
language interface to this system.





II. BACKGROUND.


A. Data Access Methods - A History.
1. Technological Effects.


During the first and second generations of computer
hardware, data storage media were tapes, relatively slow
disks and drums and the omnipresent punched card. Data
storage and retrieval concepts were shaped by these devices,
especially the punched card. A file was therefore a
sequentially ordered and accessed, contiguously stored group
of records. All the records were of fixed length. This view
is also oriented toward a monoprogramming environment with
absolute separation of one user's files from another's.


With the advent of third generation technology,
several factors began to affect data storage and retrieval
concepts. Foremost was the development of fast, high
capacity, relatively inexpensive direct access storage
devices. These devices stimulated the development of a whole
range of new access techniques such as hash coded and
indexed file organization. Secondly, the multiuser
environment caused a breakdown of the sharp division between
the execution environments of users. This breakdown was
accompanied by a rethinking of the relationships between the
overlapping data requirements of users. All these factors
led to the development of a new kind of file system known as
a data base.


2. Access Methods. 


The inspiration for a multi-purpose data base came
from management systems in which it was discovered that vast
overlap and duplication of data was occurring between
different groups in a company. For example, the payroll and
personnel sections would typically each have employee files
which were stored and maintained separately but which
overlapped by 80 percent in data content. An early
management information system (MIS) which attacked this
problem was IBM's Bill of Material Program (BOMP) (Ref. 12),
which allowed the structuring of a parts list with
subassemblies each having its own parts list; this
facilitated the management of manufacturing inventories.
When the subassemblies occurred in many different parts, the
savings afforded through avoided data duplication were
significant. The BOMP used a relatively flexible list
structure and marked a significant departure from
traditional file organization.


With the advent of the consolidated multi-purpose
data base, a whole new level of data structuring was imposed
on the techniques for physically mapping files to devices.
These data structures employed relatively complex methods
from graph theory and other disciplines which had previously
been used only on relatively small amounts of data residing
in main storage.





This new level of data base structure, combined with
the fact that a very large portion of the data base is
typically on-line, has imposed on programmers the
requirement for a new skill. This skill has been called
navigation through the data base (Ref. 13). In this view, a
programmer must travel via access paths through the data
base searching for landmarks until he has located the data
he desires. Choosing an inappropriate access path can be
extraordinarily inefficient and costly in time, so the
penalty for lack of navigational skill is high. It would
obviously be desirable to remove as much of the burden for
navigation from the programmer as is practical. The
development of modern data base management systems (DBMS)
has been made difficult by the dilemma of desiring both
optimal access paths and ease and simplicity of use for the
programmer.
3. Terminology.


This section will attempt to define the terms used
in this field of inquiry. The following definition of a data
base is due to Ref. 13:

"A data base may be defined as a collection of
interrelated data stored together with as little
redundancy as possible to serve one or more
applications in an optimal fashion; the data are stored
so that they are independent of programs which use the
data; a common and controlled approach is used in
adding new data and in modifying and retrieving
existing data within the data base. One system is
said to contain a collection of data bases if they
are entirely separate in structure."


Two types of languages are mentioned in connection
with DBMS. The first is the Data Description Language (DDL)
which describes the types of data entities which may exist
along with the allowable attributes. There may be two DDL's
or two levels of DDL for describing a data base. The first
level description is the system's view of the data base as
it is actually organized and the second, a user's view of
the data base. These levels are called the schema and
sub-schema respectively. In the relational model
terminology, the DDL may be called the relational algebra.


The second language is the Data Manipulation
Language (DML) which is concerned with the storage,
retrieval and modification of specific occurrences of the
entity types described by DDL statements. In relational
model terminology, this language corresponds to the
relational calculus. The entities handled by DDL and DML may
be records, sets or anything that may need manipulation. The
attributes may be such things as data items, set membership,
set ownership or location within the data base.


The data base model is the meta-structure which is
imposed on the organization of the data base. The model
prescribes the types of entities which are allowed. It
defines the data attributes and structural attributes that
an entity may have. The definition of a DDL and DML is the
implementation of the meta-structures of a data base model.
Currently the two most widely discussed models are the
network model and the relational model.
4. Goals of a DBMS.


The following goals have been proposed for a DBMS
(Ref. 3).

- Allow the data structures suited to each particular
application while permitting multiple applications to
use the data without need for data redundancy.

- Allow more than one process to concurrently
retrieve or update data in the data base.

- Enable the use of a variety of search strategies
against an entire data base or a portion of it.

- Provide protection of data from unauthorized
access.

- Provide centralized control over the placement of
data.

- Provide device independence for programs.

- Allow the user to interact with the data but be
free of the mechanics of maintaining the structural
associations which have been declared.

- Allow as great an independence of programs from
data and structures as possible.

- Make the data description independent of any
particular programming language but give it the capability
of interfacing with a variety of programming languages.

These goals seem to be generally agreed on in the
literature as being reasonably complete. There is
considerable controversy, however, over the relative
importance of individual goals. In particular, some contend
that the primary goal should be to allow the user to be free
of the data


base structures entirely (Ref. Or. 


B. The Network Model.


One of the two data base models which has received wide
attention is the network model. This model is grounded in
graph theory and relationships between data are represented
by some form of directed graph. The nodes of the graph may
be entities containing data attributes or may simply be
place holders whose only attributes are the arcs of the
graph. The arcs represent logical links between the
entities which can be travelled in the direction of the arc
to navigate through the data base. Thus, even though the
implementation of the arcs may be transparent to the user,
the access paths are visible to the user as part of the
structure of the data base. The DML is said to be
prescriptive of the data access paths, that is, it must
prescribe the course through the data structures.


Various restrictions as to the type of network allowed
may be imposed on a network model. For example, the graphs
may be required to be acyclic or the structures may be
restricted to trees, chains or lists. A non-homogeneous
model has the restriction that if two nodes are connected by
an arc then the entities represented by the nodes may not be
of the same type. A special class of the non-homogeneous
network model is the hierarchical model. Hierarchical models
have the following restrictions: the graphs must be trees,
an entity of any type may appear only once on a particular
branch of the tree and certain entity types must always
appear on a given branch at a higher level than other entity
types. An example of a hierarchical data base would be one
with the entities country, state, county and city. For a
particular country, one or more of the entities of state and
county may be left out between a country and its cities, but
a city cannot appear above a state, nor can it appear above
another city. An example of a non-hierarchical data base is
the BOMP, in which a subassembly may contain other
subassemblies which may in turn contain subassemblies. Note
that the BOMP is homogeneous since subassemblies are linked
to subassemblies.
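
The non-homogeneity restriction can be illustrated with a minimal sketch in Python. This is not part of the thesis implementation; the function name, the representation of entities as (type, name) pairs and the sample names are all assumptions made for illustration only.

```python
# Hypothetical sketch: a network is a list of arcs between typed entities.
# Each entity is a (type, name) pair; each arc is a (source, target) pair.

def is_non_homogeneous(arcs):
    """Return True if no arc connects two entities of the same type."""
    return all(src[0] != dst[0] for src, dst in arcs)


# Country/state/county/city hierarchy; the place names are made up.
geography = [
    (("country", "USA"), ("state", "California")),
    (("state", "California"), ("county", "Monterey")),
    (("county", "Monterey"), ("city", "Monterey")),
]

# BOMP-style parts list: subassemblies are linked to subassemblies.
bomp = [
    (("subassembly", "engine"), ("subassembly", "carburetor")),
    (("subassembly", "carburetor"), ("subassembly", "float valve")),
]

print(is_non_homogeneous(geography))
print(is_non_homogeneous(bomp))
```

Under these assumptions the geography network passes the test and the parts list does not, which matches the classification given in the text.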


C. The Relational Model.


The second of the two most discussed models for data
representation is the relational model. This model is
grounded in set theory and specifically in the concept of a
relation in the mathematical sense. Given sets S1, S2, ...,
Sn (not necessarily distinct), R is a relation on these sets
if it is a subset of the Cartesian product S1 x S2 x ...
x Sn. The elements of R are n-tuples whose jth component is
from Sj, for j from one to n. R is said to be an n-ary
relation or of degree n and Sj is called the jth domain of
R.


In the relational model, each set Sj must be a set of
like-type attributes. The relations are named and time
variant according to some maintenance algorithm. An example
of a relation would be a set called time-spectrum made up of
frequency, time and amplitude triples. This ternary
relation might represent the latest 30 minutes of data from
a hydrophone. Every relation must have a key by which the
tuples can be identified. The key must be unique (no two
tuples with the same key) and non-redundant (the whole key
is needed for uniqueness). In the above example, frequency
and time make up the key.
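
The two key properties can be checked mechanically against a sample of a relation. The following Python sketch is an illustration only: the sample values and function names are invented, and a test against one sample cannot prove that a key holds for every state the relation may take over time.

```python
# Hypothetical sample of the time-spectrum relation as a set of
# (frequency, time, amplitude) triples. The values are made up.
time_spectrum = {
    (100.0, 0, 0.8),
    (100.0, 1, 0.7),
    (200.0, 0, 0.3),
}


def is_unique_key(relation, positions):
    """True if no two tuples agree on all the given component positions."""
    seen = set()
    for row in relation:
        key = tuple(row[i] for i in positions)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_key(relation, positions):
    """True if the key is unique and no component of it can be dropped."""
    if not is_unique_key(relation, positions):
        return False
    return all(
        not is_unique_key(relation, [p for p in positions if p != dropped])
        for dropped in positions
    )


print(is_key(time_spectrum, [0, 1]))     # frequency and time
print(is_key(time_spectrum, [0]))        # frequency alone
print(is_key(time_spectrum, [0, 1, 2]))  # all three components
```

In this sample, frequency and time together satisfy both properties. Frequency alone is not unique, and the full triple is unique but redundant because amplitude can be dropped.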


The chief advantage of the relational model is that the
user's view of the data is independent not only of the
physical mapping to media, but also of the access paths
involved. The chief disadvantage is the difficulty of
devising an implementation which is reasonably efficient for
all applications [Ref. 8].


A good deal of work has been done on normalizing
relations to remove undesirable data representational
characteristics and providing appropriate operations and
transformations for relations. References 14, 15 and 16 give
a definitive exposition on the theory of relational data
models.








D. The CODASYL DBTG DBMS.
1. History.


CODASYL is an informal and voluntary organization
of interested individuals, supported by their institutions,
who contribute their efforts and expenses towards the ends
of designing and developing techniques and languages to
assist in data systems analysis, design and implementation.
Founded in 1959, its most famous achievement has been the
definition of the COmmon Business Oriented Language (COBOL).
In June, 1965, the CODASYL COBOL Language Subcommittee of
the Programming Languages Committee (PLC) resolved to
organize a task force to study list processing. In November,
1965, this task force produced a proposed list processing
extension to COBOL for file management. In May, 1967, the
List Processing Task Force changed its name to the Data Base
Task Group (DBTG) and undertook a comparative study of data
base management techniques and systems. This study
culminated in the publication of an interim report in
February, 1968 and the agreement by the Language
Subcommittee that "COBOL needs the Data Base Concept"
[Ref. 2]. At the Tenth Anniversary Meeting of CODASYL held
in May, 1969, consideration was given to separating the data
description and data manipulation languages. The idea
received wide endorsement at the meeting and was the basis
for the direction of efforts by the DBTG until October,
1969, at which time Ref. 17 was presented. From the time of
publication of Ref. 17 until the publication of Ref. 3 in
April, 1971, 179 proposals for changes and extensions to the
DDL and DML were considered, of which 130 were incorporated
into Ref. 3. In June, 1971, it was decided that the schema
DDL should be developed separately from the COBOL DDL and
DML. Accordingly, the Data Description Language Committee
(DDLC) was formed as a separate organization from the PLC.
The DDLC proceeded with modifications and enhancements to
the DBMS and schema DDL definitions and, in June, 1973,
produced Ref. 2. This document is currently the basis for
the CODASYL DBMS.
2. Terminology and Concepts.


For a complete description of the CODASYL schema DDL
statements and DBMS design see Ref. 2. The schema DDL is
used to describe a data base and has the following entity
types: data items, data aggregates, records, areas and
sets.


A data item is an occurrence of a named atomic data
attribute. It is the smallest unit of named data. The set
of values that a data item can assume is called its range.
The range of an item is always restricted to values of a
particular type. The possible types are arithmetic data,
string data, data base keys and implementor defined types.


A data aggregate is an occurrence of a named
collection of data items. There are two kinds: vectors and
repeating groups. A vector is a one dimensional sequence of
data items, all with identical characteristics. A repeating
group is a collection of data attributes that occurs
multiple times within a record occurrence. The collection of
attributes may include data items and data aggregates.


A record is an occurrence of a named collection of
zero or more data items or data aggregates. Each record
entry defines a record type of which there may be zero or
more occurrences within the data base. The record is the
smallest addressable entity within the data base.


A set is a named collection of records. Each set
entry in the schema defines a set type for which zero or
more occurrences (sets) may exist in the data base. Each set
type declared in the schema must have one record type
declared as its owner and may have one or more record types
declared as its members. Each set occurrence which exists in
the data base must contain exactly one record of its owner
type and zero or more of its member record types. A special
set type may be declared which has one and only one
occurrence and whose owner is the DBMS. A set so declared is
said to be a singular set. There is no provision for a
record type to be both an owner and member record type of
the same set. This means the CODASYL model is
non-homogeneous. It is not, however, hierarchical in that
set types may be defined with ownership and membership such
that cycles can occur.
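
The owner and member rules for sets can be sketched as follows. This Python fragment is a hypothetical illustration and is not the C language DDL designed in this thesis; the class names, record type names and sample records are assumptions.

```python
class SetType:
    """A set type: one owner record type and a list of member record types."""

    def __init__(self, name, owner, members):
        if owner in members:
            # A record type may not be both owner and member of one set type.
            raise ValueError(f"{owner} cannot be both owner and member of {name}")
        self.name = name
        self.owner = owner
        self.members = list(members)


class SetOccurrence:
    """One occurrence of a set type: exactly one owner, zero or more members."""

    def __init__(self, set_type, owner_record):
        self.set_type = set_type
        self.owner_record = owner_record
        self.member_records = []

    def insert(self, record_type, record):
        if record_type not in self.set_type.members:
            raise ValueError(
                f"{record_type} is not a member type of {self.set_type.name}"
            )
        self.member_records.append(record)


# Two set types that together form a cycle between record types, which
# is permitted because each set type is individually non-homogeneous.
staff = SetType("staff", owner="department", members=["employee"])
manages = SetType("manages", owner="employee", members=["department"])

research_staff = SetOccurrence(staff, {"name": "research"})
research_staff.insert("employee", {"name": "smith"})
```

The cycle between the two hypothetical set types shows why the model is non-homogeneous without being hierarchical.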


An area is a named collection of records which need
not preserve owner/member relations. An area may contain
occurrences of multiple record types and a record type may
occur in multiple areas. A particular occurrence of a
record is assigned to an area when it is created and
it may not migrate out of that area. An area may be
declared to be temporary. Temporary areas are created
especially for a run-unit, exist for the life of the
run-unit and are destroyed when the process terminates. Two
run-units may have a particular temporary area open
concurrently but each run-unit is using a different version
of the area which is unique to that particular run-unit. The
concept of area allows the subdivision of the data base. It
allows the DBMS to control placement of an entire area to
provide efficient storage and retrieval. Areas are a
convenient unit for recovery and also provide a convenient,
natural subdivision for allowing a part of the data base to
be removed to off-line storage.


A schema consists of DDL entries and is a complete
description of a data base. It includes the names and
descriptions of all areas, set types and record types that
may appear in the data base. A data base is the totality of
all records, sets and areas controlled by a schema. For an
installation to have multiple data bases, it must have
multiple schemas and the content of the data bases must be
disjoint.


No schema DDL entry may include references to
physical devices or media space. Thus a schema written in
the DDL is independent of the physical storage of data and
the data may be stored on any combination of storage media
available to a DBMS. Some devices, due to their sequential
nature, may not allow the full advantages of DDL facilities;
however, the use of these devices is not precluded.


A program is a set or group of instructions. User
programs must have access to a sub-schema DDL description of
that portion of the data base they are interested in.
Additionally, they must be able to use a DML to interact
with the data base through the DBMS.


A run-unit is the execution of one or more programs
viewed by the operating system as a unit. Under OS/360, the
run-unit might be a job and under UNIX, a parent process and
any children. The run-unit makes requests of the DBMS which
in turn consults the schema and interacts with the operating
system to fulfill the request.


A user working area (UWA) is conceptually a loading
and unloading zone where all data provided to a run-unit by
the DBMS and all data to be picked up by the DBMS must be
placed. The DBMS has its own system buffers which it uses
to manipulate the data base. It uses the UWA only for input
and output of data for the requesting run-unit. Each
run-unit has its own UWA.
3. The Schema vs. the Sub-schema.


The sub-schema has the following characteristics. An
arbitrary number of possibly overlapping sub-schemas may be
declared. Multiple programs may reference a sub-schema but
they have access only to that portion of the data base
included in the sub-schema. Thus, the sub-schema DDL
description enables the subsetting of the data base so that
a user program need only worry about that portion of the
data base it uses, and insulates the remainder of the data
base from the user. A measure of data independence is
provided between the schema and sub-schema. The sub-schema
description may differ from the schema in the following
ways.
a. Data Item Level.


Descriptions of items may be omitted. Included 
items may be of a different type or in a different position 


within the record. 
b. Data Aggregate Level.


Descriptions of specific data aggregates may be
omitted. Data aggregates and items may have additional
structure imposed on them (e.g. vectors may become
multi-dimensional arrays). The position of data aggregates
within a record may be changed.


c. Record Level.


Descriptions of records may be omitted.
Descriptions of new record types composed of data from other
record types may be introduced (not supported by the COBOL


mime DOL’ s). 
d. Set Level.


Descriptions of specific set types may be omitted.
Different set selection criteria may be specified.
Descriptions of specific member-record types may be omitted.
e. Area Level.


Descriptions of specific areas and the records
within them may be omitted, while occurrences of the same
record type in other areas are included.
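
The way a sub-schema may omit and reposition parts of the schema can be pictured with a small sketch. The Python below is illustrative only: the record types, item names and the `view` function are assumptions, and it covers only omission and reordering at the record and data item levels.

```python
# Hypothetical schema: record type name -> ordered list of data item names.
schema = {
    "employee": ["name", "salary", "department"],
    "project": ["title", "budget"],
}

# Hypothetical sub-schema: the "project" record type and the "salary"
# item are omitted, and the remaining items appear in a different order.
sub_schema = {
    "employee": ["department", "name"],
}


def view(record_type, record, sub_schema):
    """Present a stored record as the sub-schema describes it.

    Returns None when the record type is omitted from the sub-schema.
    """
    items = sub_schema.get(record_type)
    if items is None:
        return None
    return {item: record[item] for item in items}


stored = {"name": "smith", "salary": 1000, "department": "research"}
print(view("employee", stored, sub_schema))
print(view("project", {"title": "sonar", "budget": 5}, sub_schema))
```

A program using this sub-schema never sees the salary item or any project record, which is the insulation the text describes.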
4. The Schema and the DML.


The relationship between the DDL and the DML is that
between declarations and procedures. In order to specify
this relationship, a set of basic data manipulation
functions must be defined which is DML and host language
independent. Specific commands provided by a particular DML
must be resolved into these basic functions. Basic
functions include the capability of selecting records,
presenting them to a run-unit and adding, changing or
removing records and relationships.
5. Data Base Administration.


Certain facilities must be available to support the
user programs. These tools are not defined in the CODASYL
DBMS and may include the following.
a. Recovery Routines.


Data base recovery routines may be used, including
activity logging, checkpoint and rollback.
b. Utility Routines.


Utility and service routines are required to
support a data base in day-to-day operations. Examples
include routines for editing and printing, loading and
dumping, preconditioning, garbage collection, statistical
analysis and comparison.
c. Schema Meta-language.


This language permits changes in the schema and
causes them to be reflected in the data base. Without such a
language, the changes must be made by defining a new schema
and recreating the data base accordingly.
d. Device Media Control Language (DMCL).


This language provides for assignment of data to
devices and media space, and specification and control of
buffering, paging and overflow.
6. Data Base Procedures. 


At various points in the accessing of a data base,
nonstandard computations or processing may be required. To
allow for these situations, the capability is provided to
define data base procedures. These procedures may be
invoked for checking of privacy locks, producing computed
results from other items, searching algorithms, data
compression and expansion, validity checking or system
instrumentation.
7. Record Placement Control. 


The schema DDL permits specification of an area or
areas to which record occurrences of a particular type must
be assigned. The schema DDL also includes a clause which
causes records being added to be placed near some other
record. Conceptually, the effect of such clauses is to
cause clustering of records which are likely to be used in
conjunction with one another. These declarations for
selecting the area and location within the area are the
WITHIN clause and the LOCATION clause, respectively, of the
record subentry. The fact that the schema DDL permits
placement control is not assumed by CODASYL to have any
physical connotations.
8. Data Base Keys. 


The DDL assumes that every record occurrence in the
data base has a unique identifier which enables the DBMS to
distinguish it from every other record in the data base.
This key must be assigned when the record is created and
remains with it for the life of the record. This key may be
supplied to the DBMS by a run-unit or data base procedure,
generated from the record's contents or assigned by the
DBMS. The permanence of the key must be ensured since any
run-unit may use the keys to refer to the record.
9. Ordering of Sets.


Each set type declared in the schema must have an
ordering specified for it. This order is maintained by the
DBMS and is a logical, not a physical, ordering. Thus, the
same record occurrences could participate as members in
several sets of different types and be ordered differently
in each of the sets. The member records of each occurrence
of a given set type can be ordered in any of the following
ways.

- Sorted in ascending or descending order based on the
value of specified keys. These keys may be data items in
the member records, the names of the member records, the
data base keys of the member records or some combination of
the above.

- Sorted in the order resulting from inserting new
member records first in the set, last in the set or before
or after the set member which is currently known to the
requesting run-unit.

- Sorted in the order most convenient to the DBMS.
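
Because the ordering is logical rather than physical, the same records can appear in different orders in different sets. The Python sketch below illustrates this under assumed names; the record fields, values and helper functions are hypothetical and do not represent DDL syntax.

```python
# Hypothetical member records; field names and values are illustrative.
records = [
    {"name": "baker", "salary": 300},
    {"name": "able", "salary": 500},
    {"name": "charlie", "salary": 400},
]


def sorted_order(members, key, descending=False):
    """Logical order for a set sorted on one data item.

    A new list is returned; the stored records are not moved.
    """
    return sorted(members, key=lambda r: r[key], reverse=descending)


def insert_first(order, record):
    """Order resulting from inserting a new member first in the set."""
    return [record] + order


def insert_last(order, record):
    """Order resulting from inserting a new member last in the set."""
    return order + [record]


# The same three records, ordered differently in two sets.
by_name = sorted_order(records, "name")
by_salary = sorted_order(records, "salary", descending=True)

print([r["name"] for r in by_name])
print([r["name"] for r in by_salary])
```

Both orderings refer to the same record objects, so maintaining several set orders does not require duplicating the data.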
10. Search Keys.


An arbitrary number of search keys may be declared
for a set type regardless of whether it is sorted or not.
The components of the search keys must be data items
included in the member records of the set. The declaration
of a search key causes the DBMS to develop and use some kind
of indexing for the member records of each occurrence of the
set type. The term indexing is used here to refer to any
technique which does not involve a complete scan of the
records involved.
11. Set Membership.


A record type may have different kinds of set
membership declared for different set types. Automatic
membership means that membership is established in an
appropriate occurrence of a set type when a record is added
to the data base. Manual membership means that membership
can only be established in a set occurrence by a run-unit
executing an insert function.


Mandatory or optional membership concerns the
removal of a record from a set occurrence. Once a record has
been established as a member of a set for which it has
mandatory membership, it cannot be removed until the record
is deleted. If the membership is optional, the record may
cease to have membership via a remove function.


A set type may be declared as dynamic. A dynamic
set may have a record of any type inserted into it or
removed from it. If a set type is declared to be dynamic,
no member records may be declared for it.
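
The automatic/manual and mandatory/optional membership rules can be summarized in a short sketch. The Python below is a hypothetical illustration; the class and function names are assumptions and are not the DML functions of this thesis.

```python
class Membership:
    """Membership rules of one record type in one set type."""

    def __init__(self, automatic, mandatory):
        self.automatic = automatic  # True: automatic, False: manual
        self.mandatory = mandatory  # True: mandatory, False: optional


def store(record, set_members, rule):
    """Add a record to the data base; connect it only if automatic."""
    if rule.automatic:
        set_members.append(record)


def insert(record, set_members):
    """Explicit insert function, needed to connect a manual member."""
    if record not in set_members:
        set_members.append(record)


def remove(record, set_members, rule):
    """Remove function; refused when membership is mandatory."""
    if rule.mandatory:
        raise PermissionError(
            "a mandatory member leaves the set only when the record is deleted"
        )
    set_members.remove(record)


auto_mandatory = Membership(automatic=True, mandatory=True)
manual_optional = Membership(automatic=False, mandatory=False)
```

A record stored under the manual rule stays outside the set until `insert` is called, while a record under the mandatory rule cannot be passed to `remove` without an error.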


12. Set Selection.


In general, there will be more than one set of a
given type in the data base. It is therefore necessary to
provide a means for identifying the proper set when member
records are stored and retrieved. The SELECTION clause of
the member subentry in the DDL controls the strategy for
selecting a specific set of a given type. A separate
SELECTION clause is required for each member record type and
set type pair. The SELECTION clause provides for naming a
series of sets which form a continuous path to the desired
set. For all the sets along the path, other than the first
named set, the DBMS limits its search to the member records
of the set selected at the previous step in the path.
13. Privacy of Data.


Protection against unauthorized data access is
provided through a mechanism of privacy locks which are
specified in the schema. Privacy keys must be provided by a
run-unit seeking to access or alter data protected by a
privacy lock. The schema DDL provides for declaring privacy
locks at the schema, area, record, data item, data
aggregate, set and member levels. Locks can be declared for
specific functions at each of these levels. A privacy lock
is either a value which must be matched by a corresponding
privacy key or a data base procedure which is called to
validate the privacy key. If a procedure is used, it
returns a yes or no answer, and beyond this the action of
such a procedure is implementor defined.
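
The two forms of privacy lock, a value or a validating procedure, can be treated uniformly as a yes or no check on a key. The Python sketch below is an assumption-laden illustration: `make_lock`, the literal lock value and the sample procedure are all invented for the example.

```python
def make_lock(lock):
    """Turn a privacy lock into a yes/no check on a privacy key.

    The lock is either a literal value to be matched or a procedure
    that is called to validate the key.
    """
    if callable(lock):
        return lambda key: bool(lock(key))
    return lambda key: key == lock


# Hypothetical locks: one literal value, one validating procedure.
value_lock = make_lock("sesame")
procedure_lock = make_lock(
    lambda key: isinstance(key, str) and key.startswith("ops-")
)

print(value_lock("sesame"), value_lock("guess"))
print(procedure_lock("ops-7"), procedure_lock("guest"))
```

What a validating procedure does beyond returning its answer is left open here, as it is in the CODASYL definition.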
14. Integrity of Data.


The DDL provides for the checking of the validity
of a data item whenever a value is changed or a new value is
stored in the data base. In addition, provision is made for
the naming of data base procedures which the DBMS invokes
when a run-unit attempts to update nominated records or
sets. This feature enables a check of any update or series
of updates applied to the data base.


E. Laboratory Equipment and Software.


The computer equipment in the laboratory consists of two
PDP 11/50's with associated peripherals. The information
about the equipment which is relevant to this thesis is
minimal; however, it should be noted that the DBMS was
developed using an interactive display terminal and is
oriented toward that environment.


The operating system which supports the DBMS is UNIX.
Reference 1 contains a supplement to the following
discussion of UNIX.
1. The File System.


The most important function of UNIX is to provide a
file system. From the user's point of view there are three
kinds of files: ordinary files, directories and special
files.
An ordinary file can contain any information the
user desires. The system imposes few special structure
requirements on files; however, some programs expect files
of a certain format. A text file consists of a string of
characters with lines delimited by new line characters. A
binary program file is a sequence of words as they will
appear in main memory when the program is executed. The
assembler and loader programs use special object file
formats.


Directories provide the mapping between the names of files
and the files themselves. They induce a structure on the
file system as a whole. A directory behaves exactly like an
ordinary file except that the system controls its format and
contents. Each system user has a directory associated with
his user name and he may create sub-directories to organize
collections of his files. The system has several directories
which it maintains for its own use. One is the root
directory. The directories in a file system form a tree and
the root is the base of this tree. Thus, any file in the
system can be located by tracing a path from the root through
the appropriate directories. Another system directory
contains all the programs which are used as system commands
and is special only in that certain programs "know" its name.


Each directory must appear as an entry in exactly one other
directory called its parent. Each directory has two special
entries. These are the name "." which refers to the
directory itself and the name ".." which refers to the
parent directory. These entries enable reference to the
directory and its parent without knowing the name
explicitly.


File names are strings of 14 or fewer characters.
Identification of a file to the system is accomplished
through a string of directory names separated by virgules
("/") and terminated by the file name desired. This string
is called a path name. When the path name is started with a
virgule, the system begins the path search at the root
directory; otherwise it starts at the user's current working
directory. For example, the path name
"/foxtrot/uniform/charlie" would cause the system to start
at the root, search for directory "foxtrot", search
"foxtrot" for directory "uniform" and find file "charlie" in
"uniform". The file "charlie" could be any type of file,
including a directory. In another case, the path name "kilo"
would cause the system to search the user's current
directory for "kilo". The path name "/" refers to the root
itself.


Special files provide the means of handling I/O devices.
Each device supported by UNIX, including communications
lines and main memory, is associated with one or more
special files. These files can be read or written in the
same manner as ordinary files except that the result is the
activation of the appropriate device. All special files
reside in directory "/dev".


The access control or protection scheme in UNIX is
relatively simple. Each user known to the system has a
unique user number called the user id. When a file is
created, the appropriate user id is associated with it and
bits are set in the directory entry indicating which users
have permission to read, write or execute the file. A
facility called set-user-id is provided for executable
files, whereby when the files are executed the resulting
process assumes the user id of the owner of the executable
file. This enables a system program executed by a user to
access files which the user cannot directly access himself.
Since anyone may cause his executable files to use
set-user-id, this feature is generally available to provide
protected access to files. The system recognizes one user id
(the "super user") as being free of any access restrictions.
The major flaw in the UNIX protection scheme is that there
is no way to monitor or lock out simultaneous opening of a
file by multiple programs with access rights to the file.
The system's authors contend that these features are neither
necessary nor sufficient for integrity controls [Ref. 1].
However, the reasoning behind declaring the features
unnecessary was that "we are not faced with large
single-file data bases maintained by independent processes".


2. Input/Output (I/O) Calls.


Under UNIX, I/O calls are designed to eliminate the
difference between the various devices and forms of access.
The file system organizes all media space into 512 byte
blocks which are its smallest readable and writable unit.
Consequently, reads and writes of 512 bytes starting on a
512 byte boundary are most efficient. However, no logical
record size is imposed by the system, nor is there any
distinction between random or sequential access. To read or
write an already existing file, an "open" call must be made.
This system call is passed a path name and returns a number,
called a file descriptor, which identifies the open file to
the system. The file descriptor is used in subsequent I/O
calls. In order to create and open a file, a "creat" call
must be made. This call requires parameters which specify
the file name and access mode, and returns a file
descriptor. A "creat" on an existing file truncates it to
zero length. An open file may be accessed via "read" and
"write" calls. These system calls require the file
descriptor, the location of a read/write buffer and the
length of the buffer.
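
The calls just described can be illustrated by a minimal
sketch. It is written in present-day C with POSIX headers
rather than in the dialect of the period, and the function
name block_roundtrip and the caller-supplied scratch path are
illustrative assumptions, not part of the DBMS.

```c
#include <assert.h>
#include <fcntl.h>      /* open, creat */
#include <string.h>     /* memset, memcmp */
#include <unistd.h>     /* read, write, close, unlink */

#define BLOCK_SIZE 512  /* smallest unit the file system reads or writes */

/* Create `path`, write one block, reopen the file and read the block
 * back.  Returns 0 if the data read matches the data written. */
int block_roundtrip(const char *path)
{
    char out[BLOCK_SIZE], in[BLOCK_SIZE];
    int fd, ok;

    assert(path != NULL);
    memset(out, 'x', sizeof out);

    fd = creat(path, 0600);              /* truncates an existing file */
    if (fd < 0)
        return -1;
    if (write(fd, out, sizeof out) != (ssize_t)sizeof out) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = open(path, O_RDONLY);           /* yields a file descriptor */
    if (fd < 0)
        return -1;
    ok = read(fd, in, sizeof in) == (ssize_t)sizeof in &&
         memcmp(in, out, sizeof in) == 0;
    close(fd);
    unlink(path);                        /* delete the scratch file */
    return ok ? 0 : -1;
}
```

Every call after "creat" or "open" names the file only by its
descriptor, which is the property the DBMS relies on when it
treats pipes and area files alike.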


To enable random access of appropriate files, the "seek"
call is provided. This system call merely changes the
read/write pointer associated with an open file. The
read/write pointer contains the byte offset from the
beginning of the file at which the next access will begin.
Other system calls exist for such file manipulations as
closing a file, finding the status of a file, changing the
protection mode or owner of a file, creating or removing a
directory, making a link to an existing file and deleting a
file.


3. Processes and Images.


An image is an entire computer execution environment
including main memory image, general register values, the
status of open files and the identity of the current
directory. Thus the image constitutes a state vector of a
process which contains all information necessary to resume
execution of the process. A process is the execution of an
image while the virtual machine is imposed on the hardware
by the system. The virtual address space of a process is
divided into three logical segments: the program text
(instructions and constants), data and stack. Pure text is
read only for the user while the data and stack segments may
expand or contract in size.


A process comes into existence through a "fork" call
executed by another process. This system call creates an
exact duplicate of the image of the calling process. The
only difference between the processes is that one process is
considered the parent and the other the child. Both execute
as if returning from the "fork" call. The parent receives
as a return value a number called the process id, which
uniquely identifies the child. The child receives zero as
its return value. Synchronization between parent and child
is provided by the "wait" call. When a process with children
executes a "wait", its execution is suspended until one of
its children terminates. The return value of the "wait" is
the process id of the terminated child. Interprocess
communication is provided by the "pipe" call. This system
call sets up a channel which can be read or written by any
process which has as an ancestor the process that executed
the "pipe" call.
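
The interplay of "pipe", "fork" and "wait" can be shown in a
small sketch. It assumes a POSIX system; the function name
receive_from_child is hypothetical, and "waitpid" is used as
the targeted form of "wait".

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid */
#include <unistd.h>     /* pipe, fork, read, write, close, _exit */

/* The parent opens a pipe and forks.  The child writes `msg` into the
 * pipe and exits; the parent reads it into `out` (capacity `cap`) and
 * waits for the child.  Returns the byte count received, or -1. */
int receive_from_child(const char *msg, char *out, size_t cap)
{
    int fds[2];          /* fds[0] is the read end, fds[1] the write end */
    pid_t child;
    ssize_t n;

    assert(msg != NULL && out != NULL && cap > 0);
    if (pipe(fds) < 0)
        return -1;

    child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {                    /* fork returned zero: the child */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    /* fork returned the child's process id: the parent */
    close(fds[1]);
    n = read(fds[0], out, cap - 1);
    close(fds[0]);
    if (waitpid(child, NULL, 0) != child || n < 0)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

The pipe works only because it was opened before the "fork",
so that both processes inherit its descriptors. This
restriction becomes important in the design of the operating
environment of the DBMS.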


The "exec" system call is provided to allow the execution
of a program (i.e., executable file). The "exec" call needs
a path name to the file as its argument. A process executing
an "exec" has all its code, data and stack space overlaid by
the referenced program if the call succeeds. Open files, the
current directory and interprocess relationships remain
unchanged. A return from the "exec" occurs only if the
function is unsuccessful. Termination of a process can be
accomplished via an "exit" system call. When an "exit" is
executed, the process and associated image cease to exist.


4. The C Language.


C is the programming language primarily used under UNIX.
Most of UNIX itself is coded in C. C provides modern control
structures to allow structured GOTO-less coding. Its design
objectives were to give shorter and clearer code, encourage
modularity and good program organization and provide
facilities for many different types of data including
pointers and character strings.


A C program consists of a group of functions (one of which
must be named "main") and possibly some external data
declarations. Parameters may be passed between functions via
call and return arguments or through external data items. C
is not a block structured language in that functions cannot
be defined locally to other functions and external data
names may not be redeclared locally to a function. However,
the block structured language feature of allowing a group of
statements to be considered as a single statement is
included. This grouping is accomplished by enclosing the
statements within "{" and "}".


The basic data types in C are "int", "char", "float",
"double" and "struct". In addition, arrays of or pointers to
any of these types can be declared. Items of type "int" are
16-bit two's complement integers. Items of type "char" are
8-bit values which can be interpreted as characters or as
two's complement integers. Strings are represented as arrays
of characters. Items of type "float" or "double" are binary
floating point numbers of length 32 and 64 bits
respectively. An item of type "struct" consists of a group
of item declarations (possibly including arrays) which can
be viewed as a unit. This latter capability provides for
user definition of a theoretically infinite number of data
types.


C provides a large number of binary and unary arithmetic and
logical operators. Arithmetic operations provided are
addition and subtraction, multiplication and division,
incrementation and decrementation, and bitwise OR, AND and
complement. Logical operators allow expressions to be
compared, logically AND'ed, logically OR'ed and logically
complemented. No distinction exists between a logical
expression and an arithmetic expression. Any expression has
a true value if and only if it evaluates to a non-zero
value.


Assignment statements are provided in C which are unusual in
the following ways. An assignment statement can be used as
an expression and has the value that was assigned to the
variable on the left hand side of the assignment statement.
A number of assignment operators exist which cause a binary
operation to take place between the left hand side and the
evaluated right hand side prior to storage of the value
(e.g. "x =+ 2;" adds two to "x").
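
A short sketch may make the two properties concrete. It is
written for a present-day compiler, in which the operator
shown above as "=+" is spelled "+="; the function name
sum_until_zero is an illustrative assumption.

```c
#include <assert.h>

/* Sum the elements of a zero-terminated array.  The assignment
 * "x = *v++" is itself an expression whose value is the value stored
 * in x, and any non-zero value counts as true, so the loop ends when
 * a zero element is assigned. */
int sum_until_zero(const int *v)
{
    int x, sum = 0;

    assert(v != 0);
    while ((x = *v++))
        sum += x;        /* written "sum =+ x" in the older notation */
    return sum;
}
```

The loop condition relies on the rule stated earlier that an
expression is true if and only if it evaluates to a non-zero
value.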


The major control statements in C are "while", "do-while",
"for", "switch", "goto", "break" and "continue". The "while"
statement causes execution of a group of statements as long
as an expression is true. A "do-while" statement is like a
"while" except that the control expression is evaluated
after the execution of the group of statements. Therefore,
the body of a "do-while" statement is always executed at
least once. The "for" statement is an extension of the
"while" which provides control variable initialization and
loop incrementation. The "switch" statement allows the
execution of one of a group of statements labeled as cases
based on the value of an expression. The "goto" statement
transfers control to a labeled statement in the usual
fashion. "Break" and "continue" exist to provide for
label-free loop termination and skipping.


A subroutine library is provided for use with C programs.
It contains system calls for I/O and other functions. In
addition, it contains routines for formatted output and for
the standard functions of analysis. For a more complete
description of C, see Ref. 10 and 18.


III. IMPLEMENTATION OF THE CODASYL DESIGN.


A. Implementation Philosophy.


The overriding consideration in implementing the DBMS was to
avoid any modifications or additions to the existing UNIX
facilities. This decision was made for a number of reasons.
First, other research is being conducted in the Computer
Laboratory utilizing the UNIX operating system as a research
tool. Running systems which require non-standard versions of
UNIX interferes with the control environment and generally
makes other operating system modifications more difficult.
Second, a modification to the operating system must be
reapplied whenever a new release of UNIX is installed.
Third, the chances of the DBMS being transported to other
UNIX sites are far greater if it runs under a standard UNIX.
Finally, the research goal of determining if the DBMS could
be implemented in a variety of environments would be
subverted by modifying the operating system environment.


Since the most notable feature in UNIX is the design of its
file system, it was decided to utilize the file system
whenever possible rather than acquiring a large block of
physical media space and letting the DBMS manage it. This
philosophy was expected to simplify the problem of mapping
data to media and thereby reduce the size and complexity of
the DBMS and insulate the DBMS from changes in the hardware.


The final guideline was to implement as large a useful
subset of the features in the CODASYL design as feasible
under the above assumptions. Creative extensions to the
CODASYL design were avoided since these would tend to
obscure the research goal of being able to measure the
utility of the CODASYL network model against the INGRESS
relational model. Efforts were directed instead to the
realization of the goals of the CODASYL DDLC, which are very
ambitious in themselves.


None of the above assumptions should be taken as precluding
the possibility of future modifications to enhance the
implementation either of UNIX or of the features of the
CODASYL design. The intent of the implementation philosophy
described herein was to produce a standard CODASYL DBMS
running under a standard UNIX for use as a baseline product.


B. Organization of a Data Base.


Virtually all information about the data base described by a
particular schema is contained in a special directory. The
only exceptions are certain files which are created for the
life of a user process and then discarded. The data base,
its schema and its directory all have the same name.
Although it is possible for directories with the same name
to exist in a UNIX file system, no two data bases should
have the same name. The files within the directory
associated with a data base (called a schema directory)
contain the source and object schemas, the schema Data Base
Manager (DBM) program and all the non-temporary data within
the data base. Specific files will be mentioned when they
are relevant to the discussion. Appendix B contains a
complete listing of the files associated with a schema.


C. Operating Environment. 


The environment for both system maintenance and user access
of the data base is provided by the DBM Request Processor
("dbm"). This program is a general purpose command language
processor used to provide interface with any data base.
Appendix C contains a description of the functions of the
DBM Request Processor.


When a user wishes to execute a program which accesses a
data base, he executes dbm and specifies the appropriate
schema name. He then gives dbm an "x" command and specifies
a path name to the user program and the arguments to be
passed to the program. Dbm opens two pipes as interprocess
communication channels and forks off two children. Through
"exec" calls, these processes become the schema DBM program
and the requested user program respectively. Both programs
are passed the file descriptors of their respective ends of
the interprocess communication pipes as part of their
calling arguments. The child destined to become the schema
DBM changes directories ("chdir") to the schema directory
prior to executing the schema DBM.
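
The arrangement of two pipes and two children can be
sketched as follows. This is not the dbm source: the two
children here run small stand-in functions where the real
children would "exec" the schema DBM and the user program,
and all names (run_pair, dbm_standin, user_standin) and the
one byte request and reply are assumptions made so that the
sketch is self-contained.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the schema DBM: read a one byte request and reply
 * with the next byte value. */
static void dbm_standin(int from_user, int to_user)
{
    char c;

    if (read(from_user, &c, 1) == 1) {
        c = (char)(c + 1);
        if (write(to_user, &c, 1) != 1)
            _exit(1);
    }
    _exit(0);
}

/* Stand-in for the user program: send `req` and exit with status 0
 * only if the reply is req + 1. */
static void user_standin(int to_dbm, int from_dbm, char req)
{
    char c;

    if (write(to_dbm, &req, 1) != 1 || read(from_dbm, &c, 1) != 1)
        _exit(1);
    _exit(c == (char)(req + 1) ? 0 : 1);
}

/* Open two pipes, fork two children and wait for both.
 * Returns 0 if both children exited with status 0. */
int run_pair(char req)
{
    int to_dbm[2], to_user[2];     /* user -> DBM and DBM -> user */
    pid_t dbm, user;
    int dbm_status = -1, user_status = -1;

    assert(req >= 0 && req < 127); /* keep req + 1 within plain char */
    if (pipe(to_dbm) < 0)
        return -1;
    if (pipe(to_user) < 0) {
        close(to_dbm[0]);
        close(to_dbm[1]);
        return -1;
    }

    dbm = fork();
    if (dbm == 0) {                /* would chdir() and exec the schema DBM */
        close(to_dbm[1]);
        close(to_user[0]);
        dbm_standin(to_dbm[0], to_user[1]);
    }
    user = (dbm > 0) ? fork() : -1;
    if (user == 0) {               /* would exec the user program */
        close(to_dbm[0]);
        close(to_user[1]);
        user_standin(to_dbm[1], to_user[0], req);
    }

    /* The parent keeps no pipe ends open and waits for its children. */
    close(to_dbm[0]);
    close(to_dbm[1]);
    close(to_user[0]);
    close(to_user[1]);
    if (dbm > 0)
        waitpid(dbm, &dbm_status, 0);
    if (user > 0)
        waitpid(user, &user_status, 0);
    return (dbm > 0 && user > 0 &&
            WIFEXITED(dbm_status) && WEXITSTATUS(dbm_status) == 0 &&
            WIFEXITED(user_status) && WEXITSTATUS(user_status) == 0) ? 0 : -1;
}
```

Each child closes the pipe ends it does not use, so that each
pipe has exactly one writer and one reader once the parent
has closed its own copies.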


Once the user and the schema DBM programs are established,
dbm waits until they have terminated before accepting any
more commands. During execution, the user program and the
schema DBM pass requests and data through the two pipes with
the user program executing in the user's working directory
and the schema DBM executing in the schema directory. It is
possible for other users to execute concurrently using the
same data base; however, each user has his own version of
the schema DBM and a separate set of interprocess pipes.


This operating environment differs from the one envisioned
by the CODASYL DDLC in that each user process is interfaced
to its own copy of the DBMS routines. Each copy has its own
buffers and no knowledge of the existence of other copies,
except that which it can derive from the state of files
within the schema directory. In contrast, the CODASYL
designers described implementation of a single copy of the
DBMS routines which would concurrently communicate with all
the users and have communal system buffers for servicing all
user requests [Ref. 2]. The reasons for this difference are
twofold.


First, even though pipes are the only reasonable method for
interprocess communication, they are limited in that two
processes may communicate only via a pipe opened by a common
ancestor. In general, the only common ancestor of processes
spawned by different users is UNIX and, although the
mechanism exists for finding the process id of a process
("ps"), no mechanism exists for requesting UNIX to open a
pipe to a designated process.


Second, even if a mechanism existed to connect a single copy
of the schema DBM to users, that single copy could not
muster sufficient resources to service them. In particular,
a process may have only fourteen simultaneously open files
and each user would require two files open (its pipes) on a
dedicated basis. Thus, since access to an area requires two
files to be open, a single schema DBM having several users
each requiring several areas would develop a thrashing
condition in which almost every access to the data base
would incur the overhead of two file opens and two file
closes. Additionally, a problem with memory buffer
contention might arise, although this problem would probably
be less critical.


The existence of separate copies of the schema DBM does not
mean that the program must be duplicated in memory for each
of its current users. UNIX provides a facility for processes
executing the same program to share the same text segment;
thus only the data and stack segments are replicated for
each process. The consequences of the multiple schema DBM
environment will be discussed later.


D. Source and Object Schemas.


When a new data base is to be created, a source schema
description must be prepared. A schema directory should be
created (using the UNIX function "mkdir") to contain the
source language version of the schema. The source schema
description must reside in a file whose name is formed by
prefixing the schema name with "s." and which is located in
the schema directory. The UNIX text editor ("ed") is
suitable for entering the source schema description. The
source schema is coded in a modified form of the CODASYL DDL
described in Ref. 2. Differences between the DDL of Ref. 2
and the UNIX DBMS DDL are discussed in Appendix G.


Once the source description of the schema is entered, it
must be compiled into an object version. This compilation
is accomplished via the "c" command of the DBM Request
Processor. The object version of the schema consists of two
files which contain the schema DBM program and the encoded
schema description, respectively. The schema DBM interprets
and services all user requests for access to the data base,
while the encoded schema description is a compact symbolic
form of the schema's structure. The name of the file
containing the schema DBM is the schema name prefixed by
"dbm.". The schema DBM is discussed in Section III.l below.
The encoded schema description file is used to initialize
the schema DBM program and for information about the data
base during the move and garbage collection functions of the
DBM Request Processor. Appendix D contains a description of
the format of the schema description file.


E. Interprocess Communication.


The schema DBM and the user process communicate via the
pipes set up for them by dbm. These pipes may be read and
written just as if they were ordinary open files. Messages
of a predefined format are sent and received by both
processes. The first message sent is the initial call
message from the user process. This message is triggered by
the C DML "permit" function and contains an encoded
description of the sub-schema. The schema DBM response to
the initial call includes the index numbers for all the
entities and attributes contained in the sub-schema
description. Subsequent user program messages are requests
for data retrieval or update and are made utilizing the
index numbers acquired in the initial call. Since the schema
DBM will receive an end of file condition when trying to
read the interprocess channel after user termination, no
indication need be given to the schema DBM that the user
program has terminated.


Messages sent by the schema DBM fall into two categories:
normal responses and error messages. The error codes in
error messages correspond to those used by the C DML. For a
description of the format of all the interprocess messages
see Appendix E.


F. Data Base Keys.


In the CODASYL DBMS each record must be identified by a
unique value called its data base key. This key is assigned
when the record is created and remains with it for the life
of the record. The ability to map a record's data base key
to the record in a quick and unambiguous fashion must be
provided since the key is used for direct access. The key's
order relative to all other keys in the area must be well
defined since it is used for sequential access. However, the
record is allowed to move around in physical media space as
long as it stays within the same area. Section III.G below
will discuss how the problem of satisfying all the criteria
for data base keys was resolved.


The format of a data base key is shown in Fig. 1. This 
format makes possible 255 areas (area zero is the null area) 


each containing up to 16,777,215 records. 


Bits:     31      24 23                    0
Fields:  |  area #  |   record # in area    |


Data Base Key Format.


Figure 1.
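
The key layout of Figure 1 can be expressed as a pair of
shifts and masks. The following is a sketch only; the names
dbk_make, dbk_area and dbk_record are assumptions, and a
32-bit unsigned type from a present-day C library is used to
hold the key.

```c
#include <assert.h>
#include <stdint.h>

#define DBK_MAX_AREA   255u        /* area zero is the null area */
#define DBK_MAX_RECORD 0xFFFFFFu   /* 16,777,215 */

/* Combine an area number (bits 31-24) with a record number (bits 23-0). */
uint32_t dbk_make(uint32_t area, uint32_t record)
{
    assert(area <= DBK_MAX_AREA && record <= DBK_MAX_RECORD);
    return (area << 24) | record;
}

/* Extract the area number from a data base key. */
uint32_t dbk_area(uint32_t key)
{
    return key >> 24;
}

/* Extract the record number within the area from a data base key. */
uint32_t dbk_record(uint32_t key)
{
    return key & DBK_MAX_RECORD;
}
```

Because the area number occupies the high order bits, keys
compared as unsigned integers sort first by area and then by
record number within the area.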


When a record is first created and assigned to an area, that
area's index number becomes part of its data base key. It is
not possible, therefore, for a record to migrate to another
area. The record number is a purely logical ordering and is
implemented as described below.


G. Area Handling.


Each area specified in the schema has associated with it two
files. The first of these is the file containing the data
stored in the area. Its name is the same as the name of the
area. The second file is the data base key file for the
area. Its name is the area name prefixed by "k.".


The data base key file is organized into 24-bit entries.
Each entry contains the starting byte offset into the area
data file of the record associated with a particular data
base key for that area. The data base keys are mapped to
the entries sequentially. That is, multiplying the record
number portion of the data base key by three yields the
starting byte offset of the entry in the key file associated
with that data base key. If the value of the key file entry
for a data base key is zero then that key is null (i.e.
unassigned). The first entry slot in the data base key file
for each area is reserved for storing the highest used key
in the area to facilitate sequential searching of an area.
Thus record number zero is undefined in each area.


Data records are stored in the data file with a three word
prefix. The first three bytes of the prefix contain the
record number portion of the data base key for that record.
The fourth byte contains the record type (a total of 255
types are possible). The last two bytes of the prefix
contain the size in bytes of the record (maximum record size
is therefore 32767 bytes).
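
The six byte prefix can be pictured with the following
sketch. The structure and function names are illustrative,
and the order of the bytes within the record number and size
fields (most significant byte first) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PREFIX_SIZE 6   /* three 16-bit words */

struct rec_prefix {
    uint32_t record;    /* record number portion of the data base key */
    uint8_t  type;      /* record type */
    uint16_t size;      /* record size in bytes, at most 32767 */
};

/* Pack a prefix into the six bytes stored ahead of each record. */
void prefix_encode(const struct rec_prefix *p, unsigned char out[PREFIX_SIZE])
{
    assert(p->record <= 0xFFFFFFu && p->size <= 32767u);
    out[0] = (unsigned char)(p->record >> 16);
    out[1] = (unsigned char)(p->record >> 8);
    out[2] = (unsigned char)p->record;
    out[3] = p->type;
    out[4] = (unsigned char)(p->size >> 8);
    out[5] = (unsigned char)p->size;
}

/* Unpack the six stored bytes into a prefix structure. */
void prefix_decode(const unsigned char in[PREFIX_SIZE], struct rec_prefix *p)
{
    p->record = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
    p->type   = in[3];
    p->size   = (uint16_t)((in[4] << 8) | in[5]);
}
```

Storing the record number in the prefix allows the key file
to be rebuilt or checked from the area data file alone.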


As records are created and added to the area, they are
entered sequentially in the area data file following the
last record written (data base keys may be assigned by any
algorithm, however). The records will remain in their
original locations until they are deleted or moved during
garbage collection. If a record is moved to a new location,
its key file entry is updated accordingly. Whenever a
record is moved or deleted from the area data file, the
first two words of the prefix at its former location are
zeroed.


The positioning control mechanism provided in the schema DDL
is implemented via data base keys. The area control is
handled in a trivial fashion since the area index number is
a part of the data base key. The positioning of a record
"near" another record is accomplished by assigning the
record being added the next available data base key
following the data base key of the record it is to be
"near". This method speeds access to records clustered
"near" one another when they are used in conjunction with
one another since their data base key file entries are
likely to be in the same block. Additionally, the garbage
collection function of the DBM Request Processor
automatically re-sequences the records in the area data file
to be in ascending order of data base key. After garbage
collection the records are clustered in the area data file
as well.


If a data base key assignment algorithm causes a sparse key
space, the storage needed for recording the key entries is
minimized by the fact that UNIX only allocates storage for
blocks actually accessed. For example, if a data base key
were to be allocated whose key entry block would be 200
blocks beyond the current end of the data base key file,
only the block containing that entry would be allocated.
Even though the apparent size of the file would have
increased by 200 blocks, the intervening 199 blocks would
not be assigned any physical media space. Unfortunately, if
an empty block is read, space for it is allocated. This
means that if the area in the above example were ever
scanned sequentially, all the non-allocated blocks in the
data base key file would be allocated.


Due to the deletion and addition algorithms, gaps will
develop in the data file during the course of processing.
The total size (in bytes) of these gaps is maintained in the
first four bytes of the area data file. During the execution
of the schema DBM, the amount of wasted space is accumulated
and at the end of the run the area data file waste count is
incremented. Since this method permits more than two billion
waste bytes to be accumulated, no provision is made for
overflowing the waste count. When the waste count gets to an
unacceptable size, the DBM Request Processor can be used to
effect garbage collection.


Areas which are designated as temporary areas are handled in
a slightly different manner. Since a temporary area is local
to the user process opening it, the file names for such an
area are suffixed with the process id of the user process.
Since the process id uniquely identifies the process, these
names uniquely identify a particular version of a temporary
area. Additionally, the files associated with a temporary
area are allocated in the "tmp" directory and are deleted
when the process is terminated. The "tmp" directory has the
characteristic that if a system crash occurs, the files
within it are lost.


The files associated with any area are automatically created
by the schema DBM if it attempts to open them and they do
not exist. This means that when a schema is first created,
its areas will come into being automatically as soon as they
are needed.


H. Access Methods.


There are five access methods which may be used for locating
a record in the data base: direct, sequential, calculated,
chained and indexed.


1. Direct Access.


If the data base key of a record is known, it may be
accessed directly using the data base key mapping mechanism
described above. Every access to the data base ultimately
involves direct access once the data base key is known.
Unadorned direct access is provided to the user through
record currency and through explicit key record selection
expressions (see Section IV below).


2. Sequential Area Scan.


The "next" and "prior" records in an area are accessed
through a sequential scan. The algorithm successively
increments or decrements the record number in the current
data base key until the next or previous non-null data base
key is found.
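
The scan can be sketched over an in-memory copy of the key
file entries. The function names and the use of an array in
place of the key file are assumptions made for the
illustration; entry zero holds the highest record number in
use, as described in Section III.G.

```c
#include <assert.h>
#include <stdint.h>

/* entry[r] is the key file entry for record number r; zero marks a
 * null key.  entry[0] holds the highest record number in use. */

/* Return the next non-null record number after `current`, or 0. */
uint32_t scan_next(const uint32_t *entry, uint32_t current)
{
    uint32_t r, highest;

    assert(entry != 0);
    highest = entry[0];
    for (r = current + 1; r <= highest; r++)
        if (entry[r] != 0)
            return r;
    return 0;               /* no further record in the area */
}

/* Return the previous non-null record number before `current`, or 0. */
uint32_t scan_prior(const uint32_t *entry, uint32_t current)
{
    uint32_t r;

    assert(entry != 0);
    for (r = current; r > 1; r--)
        if (entry[r - 1] != 0)
            return r - 1;
    return 0;               /* record number zero is never a record */
}
```

The reserved first entry bounds the forward scan, so that a
search past the last record stops without reading to the end
of a sparse key file.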


3. Calculated or Hashed Access.


A data base key may be developed by a hashing algorithm
which uses data in a record for a hash key. The schema
record entry for records accessed by hashing must have a
location mode of "CALC". During record creation, CALC key
collisions are resolved by a forward linear scan until a
null key is developed. A key link is established in the
synonym record to enable future access to the new record.
If multiple collisions occur on the same data base key, a
linked list is developed leading to the last synonym added.
A standard schema DBM utility routine is used for hashing
("randkey"). If non-standard hashing for a record is
desired, a data base procedure may be specified in the
location mode clause of the record's schema entry. When a
record of a type using a non-standard hashing procedure is
to be added, the data base procedure declared in the
location mode clause for the record will be called to
provide the data base key.
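
The collision handling can be sketched over an in-memory
table. This is an illustration under stated assumptions, not
the schema DBM code: home_key is a hypothetical stand-in for
"randkey", whose actual algorithm is not given here, and the
scan is assumed to stop at the end of the table rather than
wrap around.

```c
#include <assert.h>
#include <stdint.h>

struct slot {
    int      used;      /* non-zero when a record owns this key */
    uint32_t synonym;   /* key link to the next synonym, 0 if none */
};

/* Hypothetical hash: map a hash key to a record number 1..nslots-1. */
uint32_t home_key(uint32_t hash_key, uint32_t nslots)
{
    assert(nslots > 1);
    return 1 + hash_key % (nslots - 1);
}

/* Place a record whose hashed key is `home`.  Returns the record
 * number assigned, or 0 if no null key exists after `home`. */
uint32_t calc_store(struct slot *tab, uint32_t nslots, uint32_t home)
{
    uint32_t tail, r;

    assert(tab != 0 && home >= 1 && home < nslots);
    if (!tab[home].used) {                /* no collision */
        tab[home].used = 1;
        tab[home].synonym = 0;
        return home;
    }
    for (tail = home; tab[tail].synonym != 0; tail = tab[tail].synonym)
        ;                                 /* find the last synonym added */
    for (r = home + 1; r < nslots; r++)   /* forward linear scan */
        if (!tab[r].used) {
            tab[r].used = 1;
            tab[r].synonym = 0;
            tab[tail].synonym = r;        /* key link from the old tail */
            return r;
        }
    return 0;
}
```

Retrieval would hash to the home key and then follow the
synonym links, comparing the stored CALC data at each record
until a match is found.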
4. Chained Access. 


The default method of set linkage is via chaining.
The links in a chain consist of data base keys stored in the
records to be linked. The owner record of a set contains
links to the first and last member records in the set. Each
member record contains a link to the next record in the set.
If a set is defined as "PRIOR PROCESSABLE" in the schema,
each member record has a link to the previous member of the
set. A member record defined as "LINKED TO OWNER" in the
schema will contain a link to the owner record of the set.
The link-to-next-record in the last record of a set and the
link-to-previous-record in the first record of a set both
point to the owner record of the set.
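
The chain layout can be illustrated with a small C sketch. It assumes that array indices stand in for data base keys and that an empty set is represented by an owner whose first and last links point to itself; the text does not state the empty-set convention, so that part is an assumption. The names "rec", "set_count" and "set_append" are hypothetical.

```c
#include <assert.h>

typedef int dbkey;          /* stand-in for a four byte data base key */

struct rec {
    dbkey first, last;      /* used in owner records */
    dbkey next;             /* used in member records */
    dbkey prior;            /* only if the set is PRIOR PROCESSABLE */
    dbkey owner;            /* only if the member is LINKED TO OWNER */
};

/* Count the members of the set owned by record own, walking the
 * next links until the chain returns to the owner. */
int set_count(const struct rec *db, dbkey own)
{
    int n = 0;
    for (dbkey k = db[own].first; k != own; k = db[k].next)
        n++;
    return n;
}

/* Append member m to the end of the set owned by own. */
void set_append(struct rec *db, dbkey own, dbkey m)
{
    dbkey old_last = db[own].last;
    assert(m != own);
    db[m].next = own;               /* last member points back to owner */
    db[m].prior = old_last;         /* equals own when the set was empty */
    db[m].owner = own;
    if (old_last == own)
        db[own].first = m;
    else
        db[old_last].next = m;
    db[own].last = m;
}
```

Since the chain closes on the owner, a traversal needs no separate end marker: reaching the owner's key signals the end of the set in either direction.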
5. Indexed Access.

Sets which are singular or dynamic have indices as
their primary access method. Additionally, indices are used
for secondary set linkage to implement "SEARCH" keys defined
in the schema for a record type. An index consists of a
list of data base keys ordered as specified in an "ORDER" or
"SEARCH" clause. When record selection is through set
membership with data field values specified (see Appendix A,
Section 8.1!.ea), an appropriate index will be used to gain
access to the record. See Section III.I below for a further
discussion of indices.

I. The Schema Index File.


All indices created in the data base are stored in the
schema index file. The name of this file is the schema name
prefixed by "index.". Each index in the file is organized
into 512 byte blocks, each of which has the format shown in
Fig. 2.

    struct iblock {
        int  iblink;        // link to previous index block
        int  iflink;        // link to next index block
        char ientry[508];   // up to 127 index entries
    };

Format of an Index Block.

Figure 2.


When the backward link field (iblink) in the first block
of an index is zero, the index is not in use. Minus one
indicates that it is in use. The forward link field
(iflink) of the last block in the index is zero. The entry
array (ientry) contains up to 127 four byte data base keys.
Null entries at the end of an index block are all zeroes.
When an index is first created, seven empty entry slots are
left at the end of each block for future growth.


An index is searched by a modified form of binary
search. When a record is to be located via the index, the
first index block is retrieved and the records corresponding
to the first and last data base keys entered in the block
are examined. If these records bracket the desired record,
a binary search is conducted through the index block to find
the desired record. If the desired record is not associated
with the index block, subsequent index blocks are read and
the last record for each block is examined to determine if
it brackets the desired record. When the correct block is
found, a binary search of that block is used to find the
record. For an index with k blocks having an average of n
entries in each block, the average number of records
examined in locating a record is approximately (k / 2) + m,
where m is the log base two of n.
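
The search can be sketched as follows. As a simplification, the sketch stores an integer ordering value directly in each entry, whereas the real index holds data base keys and must fetch the records they name in order to compare them. The names "iblk" and "index_find" are hypothetical. As a rough check of the cost estimate, k = 8 blocks of n = 120 entries give about 4 + 7 = 11 records examined.

```c
#include <assert.h>

#define MAXENT 127              /* entries per 512 byte block */

struct iblk {
    int n;                      /* entries in use */
    int key[MAXENT];            /* ordering values, ascending; stands in
                                   for the records the data base keys name */
};

/* Returns the block number holding want and stores the entry index
 * in *pos, or returns -1 if it is absent. */
int index_find(const struct iblk *blk, int nblk, int want, int *pos)
{
    assert(pos);
    for (int b = 0; b < nblk; b++) {
        const struct iblk *p = &blk[b];
        if (p->n == 0 || want > p->key[p->n - 1])
            continue;           /* beyond this block's last record */
        if (want < p->key[0])
            return -1;          /* falls in the gap before this block */
        int lo = 0, hi = p->n - 1;
        while (lo <= hi) {      /* ordinary binary search in one block */
            int mid = lo + (hi - lo) / 2;
            if (p->key[mid] == want) { *pos = mid; return b; }
            if (p->key[mid] < want) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }
    return -1;
}
```

The linear pass over block boundaries accounts for the k / 2 term, and the binary search inside the selected block accounts for the logarithmic term.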


When data base keys are added to the last block of an
index, a new last index block is created whenever six or
fewer empty slots remain in the block. If a block other
than the last block overflows while a data base key is being
added, a new block is inserted into the sequence of blocks.
The last seven entries of the old block are copied into the
new block and the new data base key entry is added.
Whenever the last key remaining in a block is deleted, the block
is removed from the index and freed.


Indices are used for set linkage as well as to
facilitate the maintenance of a "NO DUPLICATES" clause referring to
items not linked in another way. Indices are linked to the
records when needed by storing the starting block number of
the index in the record.


In the absence of P and V operators (Ref. 22), an index
may be reserved for use by a particular copy of the schema
DBM in the following tortuous manner. When an index is to
be accessed, the schema DBM attempts to create a file
("indexdum") in the schema directory. This file is created
with a mode that does not allow writing; therefore, should
another process attempt to create "indexdum" while it is
open, the attempt will fail. If a create fails, it is
repeated until successful. Once the "indexdum" file has
been created, the first block of the index is read from the
index file. If the backward link field of the first block
is minus one, "indexdum" is closed and the above process is
repeated until the backward link is zero. The backward link
is then set to minus one and the block written. The
"indexdum" file is then destroyed. If an index is not available,
the schema DBM releases any indices allocated in order to
avoid deadlocks.
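
The lock-out file idea can be sketched with present-day POSIX calls. Note the substitution: the text relies on a create failing because the existing file is not writable, whereas this sketch uses the O_EXCL flag of open(), which fails whenever the file already exists and behaves the same for the super user. The function names and the one second back-off are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Try once to take the lock; returns 0 on success, -1 if it is held. */
int lock_try(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Repeat the create until it succeeds, as the text describes. */
void lock_acquire(const char *path)
{
    assert(path);
    while (lock_try(path) != 0)
        sleep(1);       /* hypothetical back-off; the text only says "repeated" */
}

/* Destroy the lock file so that another process may create it. */
int lock_release(const char *path)
{
    return unlink(path);
}
```

In the scheme above, the lock file is held only long enough to test and set the backward link of the first index block; the link itself, not the file, marks the index as reserved.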


When an index is to be released, the backward link of
the first index block is set to zero and the block is
rewritten. Thus, only one schema DBM can use a particular
index at any given time. This system avoids integrity
problems stemming from simultaneous update of an index block by
two different copies of the schema DBM.


During the course of day-to-day operations, indices will
be created as well as discarded. When an index is of no
further use, its blocks must be made available for
recycling. The free blocks thus created are accounted for on a
free block list similar to the free list in a UNIX super
block. Block zero of the schema index file is used to
contain the free list. The first word of block zero
contains the block number of a free list block or zero if none
exist. The remaining 255 words are used to store the block
numbers of free blocks.


The free list is maintained as follows. When a block
must be added to the free list, its block number is stored
in the first available slot in the free list block (block
zero). If all the slots are taken, block zero is copied into
the block being freed; the block number of this free block is
stored in the first word of block zero; and the remainder of
block zero is filled with zeroes. If a free block must be
allocated to an index, the last non-null block number is
extracted from block zero and the slot is cleared. If no free
blocks are on the list and word zero contains a block number,
that block is allocated to the index after first copying its
contents into block zero. If block zero is all zeroes, a new
block is added to the end of the file for use in the index.
Other schema DBM processes are locked out during block
acquisition and freeing by the same mechanism used to gain
control of an index.
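
The free list rules can be sketched as below. The sketch keeps the index file in memory as an array of 256-word blocks and omits the lock-out step; "file", "nblocks", "blk_free" and "blk_alloc" are hypothetical names, and the NBLK bound exists only to size the array.

```c
#include <assert.h>
#include <string.h>

#define WORDS 256               /* 16 bit words in a 512 byte block */
#define NBLK  512               /* hypothetical upper bound on file size */

static int file[NBLK][WORDS];   /* in-memory stand-in for the index file */
static int nblocks = 1;         /* block zero always exists */

/* Return block b to the free list. */
void blk_free(int b)
{
    int *z = file[0];
    assert(b > 0 && b < nblocks);
    for (int i = 1; i < WORDS; i++)
        if (z[i] == 0) { z[i] = b; return; }
    /* Block zero is full: spill it into the block being freed. */
    memcpy(file[b], z, sizeof file[b]);
    memset(z, 0, sizeof file[0]);
    z[0] = b;
}

/* Obtain a block for use in an index. */
int blk_alloc(void)
{
    int *z = file[0];
    for (int i = WORDS - 1; i >= 1; i--)
        if (z[i] != 0) { int b = z[i]; z[i] = 0; return b; }
    if (z[0] != 0) {            /* reuse the spill block itself */
        int b = z[0];
        memcpy(z, file[b], sizeof file[0]);
        return b;
    }
    assert(nblocks < NBLK);
    return nblocks++;           /* extend the file */
}
```

Storing the overflow of the list inside a block that is itself free means the list never needs space of its own beyond block zero.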


J. Privacy. 


The CODASYL DBMS design allows for privacy locks to be
established at all levels. It allows separate privacy locks
for each function on a resource. Additionally, it allows a
privacy lock to be either a string to be matched or a lock
procedure. The UNIX implementation features all these
options. Their implementation is accomplished as follows.

When the files in the schema directory are created, UNIX
establishes the installation's Data Base Administrator as
their owner. By making the access privileges of a file read
and write for owner only (UNIX function "chmod"), the Data
Base Administrator can prohibit all other users in the
system from opening the file. Thus, only the Data Base
Administrator (or super user) can directly read or write a
schema file.


The DBM Request Processor may be used for both system
maintenance and to initiate user execution. However, since
the DBM Request Processor does not set-user-id, it can only
perform system maintenance functions when used by the Data
Base Administrator.


Since the schema DBM program file is executable by any
user, the DBM Request Processor can initiate it to process
user requests. Since the schema DBM does a set-user-id to
the user id of the Data Base Administrator, it can access
the schema files as required. A user must not be able to
penetrate the schema DBM to gain access to information for
which he does not have the privacy keys.


The schema DBM prevents unauthorized access using the
following procedure. The schema DBM has a privacy flag for
every function/resource pair for which a privacy lock can be
defined. When the schema DBM validates the user's
sub-schema (contained in the initial call message), it checks
the privacy keys defined in the sub-schema. Each privacy
flag for which no lock is defined, or for which the
sub-schema privacy key is valid, is set to allow access;
otherwise it is set to deny access. The user receives no
immediate indication as to whether or not his privacy keys fit the
locks. If he later tries to access some data base resource
in a way for which he did not furnish acceptable privacy
keys, his request fails. Once an initial response message
is accepted by the schema DBM, no further unlocking of
resources can be done. Thus, in order to access the denied
resource, the user program must be terminated and restarted
using a fresh copy of the schema DBM which must be provided
the proper privacy keys. Thus, no single execution of a
user program can, through trial and error, determine the
valid privacy keys.
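
A minimal sketch of the flag mechanism follows. It covers string locks only and leaves out lock procedures; the number of function/resource pairs and the names "lock", "allow", "privacy_init" and "privacy_check" are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NPAIRS 4    /* hypothetical number of function/resource pairs */

struct lock { const char *string; };    /* NULL string = no lock defined */

static int allow[NPAIRS];               /* the privacy flags */

/* Set every flag once, from the keys supplied in the sub-schema. */
void privacy_init(const struct lock locks[NPAIRS],
                  const char *const keys[NPAIRS])
{
    for (int i = 0; i < NPAIRS; i++)
        allow[i] = locks[i].string == NULL ||
                   (keys[i] != NULL && strcmp(keys[i], locks[i].string) == 0);
}

/* Later requests only consult the flag; nothing is unlocked afterwards. */
int privacy_check(int pair)
{
    assert(pair >= 0 && pair < NPAIRS);
    return allow[pair];
}
```

Because the flags are computed once and never revised, a request that is denied reveals nothing beyond the fact that the key supplied at start-up did not fit.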


K. Integrity.

As previously mentioned, the CODASYL DDLC envisioned
that the DBMS routines would be contained in a single
process which would service all users. That concept guarantees
the integrity of the data base since simultaneous update of
the data base is impossible. Additionally, a concept called
"keep" status is included in the COBOL DML [Ref. 3]. A
record has automatic "keep" status for a run-unit during the
time it is the current record of that run-unit. A run-unit
can also request "keep" status for a record if it desires to
be informed of what happens to the record. If a run-unit
modifies or deletes a record which has "keep" status for
another run-unit, the run-unit having the record in "keep"
status will be notified of the action. Although the "keep"
mechanism does not resolve the problem of concurrent update,
it does provide a mechanism for identifying potential
problems. "Keep" status allows run-units to update the data
base while still allowing access to it.


Since each user process is coupled to its own version of
the schema DBM, none of the above features can be readily
implemented. As a consequence, if multiple schema DBM
programs concurrently open an area for update, data base
integrity problems are virtually assured. If, however, all
users open any area to be updated for protected or exclusive
use (see Appendix A, Section B.2.6), no integrity problems
can arise.


An area opened for protected use cannot be opened by
another process for update. An area opened for exclusive
use cannot be opened at all by another process. The
mechanism for insuring that these rules are enforced is the
Logical Usage Block File. This file resides in the "tmp"
directory and has the same name as the schema (hence the rule
that no two data bases may have the same name). It contains the
logical usage block which records the opening mode of every
area currently open by any copy of the schema DBM. When a
schema DBM desires to open an area, it reads the logical
usage block and determines whether or not a conflict exists
between the opening mode it desires and the modes in use by
other processes having the area open. If no conflict
exists, it opens the area and updates the logical usage
block accordingly; otherwise, it notifies the user that the
open has failed.
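
The conflict test can be sketched as a small predicate. The two stated rules are applied in both directions, that is, a request for protected or exclusive use is also refused when an existing opening would violate it; the text implies this symmetry but does not spell it out, so it is an assumption here. The type and function names are hypothetical.

```c
#include <assert.h>

enum usage  { UNPROTECTED, PROTECTED, EXCLUSIVE };
enum intent { RETRIEVAL, UPDATE };

struct opening { enum usage usage; enum intent intent; };

/* Nonzero if an area already open as held by another process may
 * not also be opened as want. */
int conflicts(struct opening held, struct opening want)
{
    if (held.usage == EXCLUSIVE || want.usage == EXCLUSIVE)
        return 1;
    if (held.usage == PROTECTED && want.intent == UPDATE)
        return 1;
    if (want.usage == PROTECTED && held.intent == UPDATE)
        return 1;
    return 0;
}

/* Check a request against every opening recorded for the area. */
int may_open(const struct opening *held, int nheld, struct opening want)
{
    assert(nheld >= 0);
    for (int i = 0; i < nheld; i++)
        if (conflicts(held[i], want))
            return 0;
    return 1;
}
```

Under these rules at most one updater can hold a protected area, which is what makes protected and exclusive opening sufficient to avoid the integrity problems described above.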


In order to avoid problems with simultaneous update of
the logical usage block by different processes, a lock out
file mechanism similar to "indexdum" is employed. This file
is named "opendum" and resides in the schema directory. The
"opendum" file is created to lock out other processes while
the logical usage block is being accessed.


L. The Schema DBM. 


As mentioned earlier, the DBMS compiler must produce a
schema DBM program when a schema is compiled. The schema
DBM program is composed of two parts: the schema constants
and the DBM skeleton.
1. Schema Constants. 


The DBMS compiler must produce a C coded temporary
file which contains all the schema unique constants
necessary to tailor the DBM skeleton to the schema being
compiled. These constants cause the various buffers and arrays
used by the DBM program to be allocated sufficient memory to
handle the schema. For a description of the constants and
arrays involved see Appendix H.

Additionally, this temporary file must include the
initialization for the arrays "procpoint" and "procname".
These arrays contain, respectively, pointers to and the
names of all the data base procedures in the schema.
Whenever a data base procedure name is encountered during the
initialization phase of the schema DBM, "procname" is
searched until the matching name is found and a pointer to
the data base procedure is extracted from the corresponding
"procpoint" entry.


When the constant file has been generated, the DBMS
compiler can use the C compiler to form the schema DBM from
the constant file and skeleton DBM. Since pointers to the
data base procedures are used as initializing constants in
"procpoint", all the data base procedures will automatically
be loaded into the output object module after compilation.
Both the DBM skeleton and the data base procedures must
exist as object modules available to the C compiler. All
the external arrays which are dimensioned in the constant
file are declared but not explicitly dimensioned in the
skeleton DBM. The finished product of the C compiler is an
executable schema DBM.
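
The "procname" and "procpoint" arrays described in this section might look as follows in a generated constant file. Only the two array names come from the text; the procedures, their signature, the "nprocs" count and the "proc_lookup" routine are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Two hypothetical data base procedures. */
static int hash_by_name(void) { return 1; }
static int check_lock(void)   { return 2; }

/* What a generated constant file might contain. */
const char *procname[]   = { "hash_by_name", "check_lock" };
int (*procpoint[])(void) = { hash_by_name, check_lock };
int nprocs = 2;

/* Linear search of procname; returns the matching procpoint entry,
 * or a null pointer if the name is unknown. */
int (*proc_lookup(const char *name))(void)
{
    assert(name);
    for (int i = 0; i < nprocs; i++)
        if (strcmp(procname[i], name) == 0)
            return procpoint[i];
    return 0;
}
```

Because the initializer of "procpoint" refers to each procedure by name, the loader is forced to include every listed procedure in the schema DBM, which is the effect the section relies on.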
2. The DBM Skeleton.

The DBM skeleton is an object module which contains
all the DBMS routines (except data base procedures) required
to provide user services for the data base. Appendix F
contains a complete description of the DBM skeleton. The
processing of the skeleton (and thus the schema DBM) is divided
into two phases: initialization and user request processing.
a. Initialization Phase.

When the schema DBM is called, it has no
information about the organization of the schema or sub-schema.
Although all its buffers are the right size and all the
necessary data base procedures are compiled into it, it has
no knowledge of data base names, privacy locks, set
relationships or any other data peculiar to the schema. In
order to function, it must read in all the data in the
schema description file. Concurrently, it processes the
user program's sub-schema which is passed in the user
program's initial call message. By validating the
sub-schema while initializing the schema, the schema DBM can
immediately translate all references into terms of its
internal index numbers rather than store data base names.
This avoids much of the matching overhead for each user
request. If the sub-schema fails the validation, the schema
DBM sends the user program an error message, returns to the
beginning of the schema description file and restarts
initialization. The initialization phase will thus be
terminated only if the user program either submits a valid
sub-schema or terminates.
b. User Request Servicing Phase.

After successful initialization, the user
request servicing phase begins. This phase consists of a
loop which reads a user message, processes it, and returns a
response to the user program. The loop runs until user
program termination, at which point the schema DBM terminates.


Processing a user message is accomplished by
selecting a service routine based on the message type. One
service routine exists for each message type except the
initial call, with an additional routine to process invalid
message types. During this phase, an initial call is
considered an invalid message. Each service routine uses one
or more utility routines. Utility routines are general
purpose data base access and maintenance primitives which may
be used by several service routines.
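
The selection of a service routine by message type can be sketched with a table of function pointers. The message types, the message layout and the routine names below are all hypothetical; the actual set is defined with the DBM skeleton in Appendix F.

```c
#include <assert.h>

enum msgtype { MSG_INITIAL_CALL, MSG_FIND, MSG_GET, MSG_CLOSE, MSG_NTYPES };

struct message { int type; int status; };

static void svc_find(struct message *m)    { m->status = 0; }
static void svc_get(struct message *m)     { m->status = 0; }
static void svc_close(struct message *m)   { m->status = 0; }
static void svc_invalid(struct message *m) { m->status = -1; }

/* One entry per message type; the initial call maps to the invalid
 * handler because it is not legal once servicing has begun. */
static void (*const service[MSG_NTYPES])(struct message *) = {
    svc_invalid, svc_find, svc_get, svc_close
};

/* Process one user message in place and leave the response in it. */
void dispatch(struct message *m)
{
    assert(m);
    if (m->type < 0 || m->type >= MSG_NTYPES)
        svc_invalid(m);
    else
        service[m->type](m);
}
```

The servicing loop would then read a message, call a routine such as "dispatch" and write the response back, repeating until the user program terminates.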
IV. DESIGN OF THE C DDL AND DML.


A. Design Goals and Decisions. 


The augmentation to the C language was designed to
provide a natural interface between the C language and the DBMS
without reducing its ability to support a COBOL DDL and DML.
Accordingly, the C DML was designed to have as much
functional similarity to the COBOL DML as was feasible. This
goal was adopted to support the research objective of
testing CODASYL's contention that the DBMS could support a
variety of sub-schema DDL's and DML's. See Section C below
for a comparison of the COBOL and C DDL and DML.


One of the desirable goals of a DBMS is to provide
program independence from the definition of the data base.
Additionally, one of the primary design philosophies of C
was economy of expression. In order to facilitate both of
these goals, the C DDL provides for describing only a
minimal subset of the relationships and restrictions which
appear within a schema description. The DDL is restricted
to describing the names and privacy keys for areas, records,
data items, data aggregates and sets, and the membership of
record types in set types. Data types of items and
aggregates must be specified as well, but this information may be
different from the data types recorded in the schema
description. Additionally, the DDL need only describe those
portions of the schema the program is interested in
manipulating. Since this information is the only data absolutely
necessary to the DML, unless major changes affecting the
validity of the program logic occur in the schema, a
recompilation of the program should seldom be needed. The
programmer obviously needs to know a lot more about the schema;
however, he can (and should) obtain this information from an
installation Data Element Dictionary [Ref. 19].


A fourth goal was to integrate the C DDL and DML into
the host language's structure whenever possible.
Accordingly, the DDL was grouped into a special external function
and the DML functions have been given formats similar to
other C special functions such as "return". DML logical
expressions are compatible with normal C expressions.


B. Major Concepts.

This section describes some of the concepts essential to
the implementation and use of the C DDL and DML. For a
complete definition of the C DDL and DML, see Appendix A.


1. Currency. 


The concept of currency is central to the navigating
of access paths in the DML. The user process as well as
each record, set and area type known to it have a current
record associated with them. This currency is established
by the execution of a "find". When a record is found, it
becomes the current record of the process, its record type,
the area in which it resides and the set type of every set
occurrence the record participates in as a member or owner.
The current set occurrence of each set type is the set in
which the current record of that set type participates.




Information about a record, including data values if
the record was fetched by "get", continues to be available
until the record is replaced as the current record
everywhere its currency was originally established. For example,
if a record is the current record of a particular area, it
will remain available as the current record of that area
until a "find" is executed which selects a different record
residing in the same area.
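
The currency rules can be pictured as a small table that a successful "find" updates. The sketch below is illustrative only: the table sizes, the "currency" structure and the "make_current" routine are hypothetical, and integers stand in for record, set and area types.

```c
#include <assert.h>

#define NRECTYPES 5     /* hypothetical counts for one sub-schema */
#define NSETTYPES 4
#define NAREAS    2

typedef long dbkey;     /* stand-in for a data base key */

struct currency {
    dbkey of_process;
    dbkey of_rectype[NRECTYPES];
    dbkey of_settype[NSETTYPES];
    dbkey of_area[NAREAS];
};

/* Record the result of a successful "find". sets lists the set
 * types in which the record participates as owner or member. */
void make_current(struct currency *c, dbkey key, int rectype, int area,
                  const int *sets, int nsets)
{
    assert(rectype >= 0 && rectype < NRECTYPES);
    assert(area >= 0 && area < NAREAS);
    c->of_process = key;
    c->of_rectype[rectype] = key;
    c->of_area[area] = key;
    for (int i = 0; i < nsets; i++)
        c->of_settype[sets[i]] = key;
}
```

A later "find" overwrites only the entries it touches, so a record stays current for any record type, set type or area that the later record does not share.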


2. Find versus Get.

The difference between the "find" and "get"
functions is that the former locates records in the data base
while the latter extracts data values from the data base.
When a "find" is executed, the DBMS spans the access paths
specified in the record selection criterion and returns all
the information about a record necessary to make it current
in the appropriate places. For a description of the record
selection options available, see Appendix A, Section en ale <.
The information necessary to establish currency includes the
selected record's data base key, record type and the set
types for the sets it participates in as a member or owner.
The area the record resides in can be derived from its data
base key.

A "get" is used to access the data values associated
with a record. The record must have been made the current
record of the process prior to executing the "get". The
values of the record's data items are available via pointers
associated with each entity for which the record is current.
Whether an implementor elects to provide separate buffers
for each currency type or merely reassign the value of
pointers is immaterial.
3. Independence of Schema and Sub-schema.

A user program can be compiled without reference to
the schema description. A program using a sub-schema will
continue to execute until the data base name, privacy locks
or set memberships described in the sub-schema are changed
or deleted in the schema. However, changing entries such as
record location modes and set selection clauses in the
schema may alter the logic of a program.


The data types of the sub-schema may differ from
those of the schema. The DBMS will automatically convert
data to the types desired by the sub-schema before delivery.
Some type differences, however, may cause an error if the
data involved is incompatible. For example, converting a
string of characters to an integer will fail if the string
contains any non-numeric characters.
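
The string-to-integer case can be sketched as follows. The routine name and its return convention are hypothetical, the item is treated as a fixed width field with no terminating NUL, and overflow is ignored on the assumption that such items are only a few digits wide.

```c
#include <assert.h>
#include <ctype.h>

/* Convert a fixed width character item (not NUL terminated) to an
 * integer. Returns 0 on success, -1 if any character is non-numeric.
 * *out is left untouched on failure. */
int item_to_int(const char *item, int width, int *out)
{
    int v = 0;
    assert(item && out && width > 0);
    for (int i = 0; i < width; i++) {
        if (!isdigit((unsigned char)item[i]))
            return -1;
        v = v * 10 + (item[i] - '0');
    }
    *out = v;
    return 0;
}
```

Applied to a three character agent number such as "007", a conversion of this kind yields the integer 7, while a field containing a letter is rejected.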
C. Comparison with the COBOL DDL and DML.


The comparison will be made by means of an illustrative
example. For a detailed description of the C and COBOL DDL
and DML see Appendix A and Ref. 3 respectively. The COBOL
portion of the example is patterned after Ref. 20. Several
modifications were made to reflect both recent changes in
the definition of the CODASYL DBMS and UNIX implementation
data types.


The example concerns the representation of a personnel
data base. Figure 3 is a diagram of the network and records
involved. In Fig. 3, the rectangles represent records and
the arrows point from set owners to set members. Record
names are written above rectangles, item values within. Set
names are superimposed over the set linkage arrows.
Multiple agents may be linked to an assignment and multiple
assignments to an agent. The fact that a particular agent
is assigned to a particular assignment is represented by the
existence of a LINK record which has membership in both the
AGENT-LINK set owned by that agent's AGENT record and the
ASSIGNMENT-LINK set owned by that assignment's ASSIGNMENT
record. The link records are made necessary by the
restriction that a record may only be a member of one set of a
given type.


Figures 4 and 5 show the DDL description of the schema.
The COBOL sub-schema could be essentially an exact copy of
the schema. The C sub-schema is shown in Fig. 6. Note that
an agent's number is represented as a numeric character
string in the schema and as an integer in the C sub-schema.
In the C sub-schema, all the data base names are spelled
with lower case letters and with hyphen ("-") replaced by
underscore ("_"). The DBMS translates identifiers to allow
for this difference in spelling conventions. In the UNIX
implementation, data base names spelled with any
combinations of upper and lower case letters will be properly
recognized.
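
One way such a name comparison could work is sketched below. The routine names are hypothetical and the sketch says nothing about how the DBMS actually stores or translates identifiers; it only shows a comparison that ignores case and treats hyphen and underscore as the same character.

```c
#include <assert.h>
#include <ctype.h>

/* Map a character to a canonical form: upper case, with '_'
 * treated as '-'. */
static int canon(int ch)
{
    return ch == '_' ? '-' : toupper((unsigned char)ch);
}

/* Compare a schema name with a C sub-schema name. Returns 1 on a
 * match and 0 otherwise. */
int names_match(const char *a, const char *b)
{
    assert(a && b);
    while (*a && *b && canon(*a) == canon(*b)) {
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}
```

With a comparison of this kind, the schema name AGENT-SKILL and the sub-schema name agent_skill are recognized as the same set.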


[Network diagram. Records shown: Department (Function:
espionage, Head: M); Agent (Number: 007); Assignment (Name:
Goldfinger; Name: Thunderball); Skill (Name: spy, Level: 1;
Name: lover, Level: 10); Link. Sets shown: department-agent,
agent-skill, agent-link, assignment-link.]

Network Representation of a Data Base.

Figure 3.
    SCHEMA NAME IS PERSONNEL-FILE.

    AREA NAME IS DEPARTMENT-AREA.
    AREA NAME IS ASSIGNMENT-AREA.

    RECORD NAME IS DEPARTMENT;
        LOCATION MODE IS CALC USING FUNCTION
            DUPLICATES ARE NOT ALLOWED;
        WITHIN DEPARTMENT-AREA.
      01 FUNCTION; PICTURE IS "A(20)".
      01 HEAD; PICTURE IS "A(1)".

    RECORD NAME IS AGENT;
        LOCATION MODE IS CALC USING NUMBER
            DUPLICATES ARE NOT ALLOWED;
        WITHIN DEPARTMENT-AREA.
      01 FIRST-NAME; PICTURE "A(10)".
      01 LAST-NAME; PICTURE "A(10)".
      01 NUMBER; PICTURE "9(3)".

    RECORD NAME IS SKILL;
        LOCATION MODE IS VIA AGENT-SKILL SET;
        WITHIN AREA OF OWNER.
      01 NAME; PICTURE IS "A(20)".
      01 LEVEL; TYPE IS FIXED DECIMAL.

    RECORD NAME IS ASSIGNMENT;
        LOCATION MODE IS CALC USING NAME OF ASSIGNMENT;
        WITHIN AREA OF OWNER.
      01 NAME; PICTURE IS "A(20)".

    RECORD NAME IS LINK;
        LOCATION MODE IS VIA AGENT-LINK SET;
        WITHIN AREA OF OWNER.

Schema DDL Area and Record Entries.

Figure 4.
    SET NAME IS DEPARTMENT-AGENT;
        OWNER IS DEPARTMENT;
        ORDER IS PERMANENT INSERTION IS
            SORTED BY DEFINED KEYS;
        MEMBER IS AGENT MANDATORY
            AUTOMATIC LINKED TO OWNER;
        KEY IS ASCENDING NUMBER;
        SET SELECTION IS THRU DEPARTMENT-AGENT OWNER
            IDENTIFIED BY CURRENT OF SET.
    SET NAME IS AGENT-SKILL;
        OWNER IS AGENT;
        ORDER IS PERMANENT INSERTION IS SORTED BY DEFINED KEYS;
        MEMBER IS SKILL MANDATORY AUTOMATIC;
        KEY IS DESCENDING LEVEL;
        SET SELECTION IS THRU AGENT-SKILL OWNER
            IDENTIFIED BY CURRENT OF SET.
    SET NAME IS AGENT-LINK;
        OWNER IS AGENT;
        ORDER IS PERMANENT INSERTION IS IMMATERIAL;
        MEMBER IS LINK MANDATORY AUTOMATIC LINKED TO OWNER;
        SET SELECTION IS THRU AGENT-LINK OWNER IDENTIFIED
            BY CALC-KEY EQUAL TO CURRENT-AGENT.
    SET NAME IS ASSIGNMENT-LINK;
        OWNER IS ASSIGNMENT;
        ORDER IS PERMANENT INSERTION IS IMMATERIAL;
        MEMBER IS LINK MANDATORY AUTOMATIC LINKED TO OWNER;
        SET SELECTION IS THRU ASSIGNMENT-LINK OWNER
            IDENTIFIED BY CALC-KEY EQUAL TO
            CURRENT-ASSIGNMENT.

Schema DDL Set Entries.

Figure 5.
    ddl{
        schema personnel_file;
        area department_area;
        area assignment_area;
        record department {
            char function[20];
            char head[1];
        }
        record agent {
            char first_name[10];
            char last_name[10];
            int number;
        }
        record skill {
            char name[20];
            int level;
        }
        record assignment {
            char name[20];
        }
        record link {
        }
        set department_agent owner is department {
            member agent,
        }
        set agent_skill owner is agent {
            member skill,
        }
        set agent_link owner is agent {
            member link,
        }
        set assignment_link owner is assignment {
            member link,
        }
    }

C Sub-schema Entries.

Figure 6.
1. Query 1.

The first query is designed to extract the skills of
agent 007. The procedure is to initialize the agent number,
FIND the agent and print the agent number, skill name and
skill level for each skill the agent has (if any). The
COBOL and C realizations of Query 1 are shown in Fig. 7 and
8 respectively. In both versions access to the agent is via
the CALC key in the AGENT record and the appropriate
AGENT-SKILL set is automatically selected when the agent 007
becomes current. The agent record need not be fetched since
none of its data fields are needed.


        OPEN ALL.
    FIND-AGENT-RECORD.
        MOVE '007' TO NUMBER OF AGENT.
        FIND AGENT RECORD.
    READ-FIRST-SKILL.
        FIND FIRST SKILL RECORD OF AGENT-SKILL SET.
        IF ERROR-STATUS = 0326 GO TO ALL-DONE.
    PRINT-SKILL.
        GET.
        DISPLAY 'AGENT = ', NUMBER OF AGENT, ', SKILL = ',
            NAME OF SKILL, ', LEVEL = ', LEVEL OF SKILL.
    READ-NEXT-SKILL.
        FIND NEXT SKILL RECORD OF AGENT-SKILL SET.
        IF ERROR-STATUS = 0307 GO TO ALL-DONE
            ELSE GO TO PRINT-SKILL.
    ALL-DONE.

Query 1 Coded in COBOL.

Figure 7.
    dbopen();
    agent.number = 7;
    find(agent);
    for(find(first skill of agent_skill);!error.status; ) {
        get();
        printf("Agent = %d, Skill = %s, Level = %d\n",
            agent.number,skill.name,skill.level);
        find(next skill of agent_skill);
    }

Query 1 Coded in C.

Figure 8.


2. Query 2.

The second example query is designed to find all
department heads concerned with the assignment
"THUNDERBALL". The procedure is as follows. FIND the ASSIGNMENT
record whose NAME is "THUNDERBALL". For each LINK record in
the assignment's ASSIGNMENT-LINK set,

- find the link's owner in the AGENT-LINK set it belongs
to,

- find that AGENT record's owner in the DEPARTMENT-AGENT
set it belongs to and

- print the assignment name and department head.

The COBOL and C realizations of Query 2 are shown in
Fig. 9 and 10 respectively. In both versions, once the
desired link is made current, the appropriate DEPARTMENT
record can be reached via the AGENT-LINK and
DEPARTMENT-AGENT set. Note that only the DEPARTMENT records need to be
fetched since all the spanning of access paths is
accomplished through currency information.
        OPEN ALL.
    FIND-ASSIGNMENT-RECORD.
        MOVE 'THUNDERBALL' TO NAME OF ASSIGNMENT.
        FIND ASSIGNMENT RECORD.
    FIND-FIRST-LINK.
        FIND FIRST LINK RECORD OF ASSIGNMENT-LINK SET.
        IF ERROR-STATUS = 0326 GO TO ALL-DONE.
    FIND-AGENT-LINK-OWNER.
        FIND OWNER RECORD OF AGENT-LINK SET.
    READ-DEPARTMENT-RECORD.
        FIND OWNER RECORD OF DEPARTMENT-AGENT SET.
        GET.
    PRINT-DEPARTMENT-HEAD.
        DISPLAY 'ASSIGNMENT = ', NAME OF ASSIGNMENT,
            ', HEAD = ', HEAD OF DEPARTMENT.
    FIND-NEXT-LINK.
        FIND NEXT LINK RECORD OF ASSIGNMENT-LINK SET.
        IF ERROR-STATUS = 0307 GO TO ALL-DONE
            ELSE GO TO FIND-AGENT-LINK-OWNER.
    ALL-DONE.

Query 2 Coded in COBOL.

Figure 9.


    dbopen();
    for(i=0;i<12;i++)assignment.name[i]="THUNDERBALL"[i];
    find(assignment);
    for(find(first link of assignment_link);!error.status;) {
        find(owner of agent_link);
        find(owner of department_agent);
        get();
        printf("Assignment = %s, Head = %.1s\n",
            assignment.name,department.head);
        find(next link of assignment_link);
    }

Query 2 Coded in C.

Figure 10.
V. CONCLUSIONS AND RECOMMENDATIONS.

A. Conclusions.

Since the DBMS compiler and the C language augmentation
are not yet implemented, it is difficult to fully evaluate
the effectiveness and efficiency of the DBMS. In general,
it can be said that the UNIX file system seems to be a very
hospitable environment for developing a DBMS; however, the
operating system facilities of UNIX are not nearly as well
suited to supporting this development. The DBMS is measured
against some of the goals of DBMS as they are presented in
Section 1.4.4.


1. Concurrent Retrieval and Update.

The DBMS cannot provide the ability to perform
concurrent update of the same area by two users. Although the
ability to open an area for unprotected update exists, its
use can be disastrous. Concurrency between update and
retrieval in an area causes no integrity problems; however,
the user doing retrievals has no way of knowing if the
records he is accessing are being modified.
2. A Variety of Search Strategies.

The DBMS supports every form of access path
specified by the CODASYL DDLC. These forms are direct, hashed,
sequential and indexed.
3. Centralized Placement Control.

Placement control by the DBMS is a purely logical
mapping with the UNIX file system providing centralized
placement control for the data onto physical media.
4. Device Independence.

Device independence is almost total for any file in
UNIX. The DBMS (and therefore the user program) is unaware
of either the types or number of devices in the system.
5. Privacy of Data.


The complete privacy mechanism in the CODASYL design
has been implemented. The DBMS itself should be relatively
secure. A program could be written to call the schema DBM
repeatedly and determine a privacy key by trial and error,
but using data base procedure privacy locks which notify a
security console or terminate the program when a violation
occurs can greatly reduce the effectiveness of trial and
error "lock picking".


UNIX itself, however, is too easily penetrated [Ref.
11]. Locating and plugging all the holes in UNIX may be
impossible.


6. Independence of Schema and Sub-schema.


The DBMS provides the maximum amount of independence
possible under the CODASYL design. In fact, user programs
could be compiled without any reference at all to the
schema.


B. Recommendations.


1. Enhancement for Concurrent Update.


In order to enhance the ability for concurrent use
of a data base, the following approaches might be taken.
a. Centralized Schema DBM.


UNIX could be modified to provide a mechanism
for establishing interprocess communication to any
designated process. This would enable implementation of a
centralized schema DBM as the CODASYL DDLC intended. This
alternative remains impractical for the reasons given in
Section III.C, i.e., the schema DBM would run out of file
resources. Additionally, the UNIX modification would have an
unknown but probably major impact on the operating system's
design.
b. System P and V Call.


UNIX could be modified to provide a system call
for P and V operators. For a discussion of P and V
operators see Ref. 12. If a fast P and V facility were
available, a schema DBM could temporarily halt all update or
access to an area while performing modifications. "Keep"
status could be implemented, if desired, by storing
indicators in the record itself. The impact of such system
calls on UNIX's design philosophy is expected to be minimal.


Additionally, existing communications between
schema DBM programs could be speeded up. Specifically, the
methods used with "opendum" and "indexdum" to lock out
simultaneous update are essentially a "test and set"
operation which could be implemented more efficiently with P
and V system calls.
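

The "test and set" style of lock-out mentioned above can be
illustrated with a lock file. The following is only a sketch
under stated assumptions: it uses the present-day POSIX
open() flags O_CREAT and O_EXCL, which are not the mechanism
used by this implementation, and the names area_try_lock and
area_unlock are hypothetical. It shows why a process that
fails the test can only retry, which is the cost a blocking P
and V facility would avoid.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to take a lock by creating a lock file.
 * O_CREAT | O_EXCL makes the "test" (does the file exist?) and the
 * "set" (create it) one atomic step.
 * Returns 1 if the lock was obtained, 0 otherwise. */
int area_try_lock(const char *lockpath)
{
    int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return 0;       /* already locked, or the file cannot be created */
    close(fd);
    return 1;
}

/* Hypothetical helper: release the lock by removing the lock file.
 * Returns 1 on success, 0 on failure. */
int area_unlock(const char *lockpath)
{
    return unlink(lockpath) == 0;
}
```

A caller that receives 0 from area_try_lock must poll until
the lock is released, whereas a P operation would suspend the
caller until the matching V is issued.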
2. Enhancement for Faster Access.


In the absence of usage data, it is difficult to
estimate the access response speed of the DBMS. However, a
logical extension to the access methods provided by the DBMS
would be multilevel indices. The index structure now in the
DBMS is essentially an index sequential access scheme which
could be upgraded to the multilevel structure which is
typical of such indices. For a discussion of the multilevel
index sequential access method, see Ref. 13. Use of a two
leveled index would divide the average number of records
scanned to find the right index block by the average number
of entries in the index blocks.
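

The arithmetic behind this estimate can be sketched as
follows. The model is an assumption made for illustration
only: a sequential scan is taken to examine half of the
candidates on average, and the function names are
hypothetical.

```c
/* Average number of index blocks examined when the first-level index
 * of n_blocks blocks is scanned sequentially: about half of them. */
double avg_scan_single(double n_blocks)
{
    return n_blocks / 2.0;
}

/* With a second level whose entries each point at one first-level
 * block, only n_blocks / entries_per_block second-level blocks need
 * to be scanned, so the average is divided by entries_per_block. */
double avg_scan_two_level(double n_blocks, double entries_per_block)
{
    return (n_blocks / entries_per_block) / 2.0;
}
```

For example, with 1000 index blocks and 50 entries per block,
the average falls from 500 blocks scanned to 10 under this
model.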
3. Automatic Garbage Collection.


Since only the Data Administrator can initiate garbage
collection, the wasted space growth rate in a data base
may become a problem. Some consideration should be given to
having the schema DBM automatically garbage collect when the
waste in an area reaches a critical level. This thesis did
not address the problem of automatic garbage collection due
to the difficulty of determining what amount of wasted space
is critical.


APPENDIX A. C LANGUAGE DDL AND DML.


A. C Language DDL.


The DDL in C is designed to interface the subschema
description with the schema description with a minimal
requirement for path information from the user and maximal
similarity with existing C language constructs. In the
following discussion, words enclosed in apostrophes denote
variable data. When a 'lock' is specified, the data item
must be of type "character pointer". All 'db identifiers'
specified must match the appropriate data base names in the
schema after translation of lower case into upper case and
underscore into dash. All DDL statements are enclosed in a
"ddl" routine with the following format:

COGAG wade limStiaikements <cneat” « 
The ddl routine should appear prior to any DML statements.
It may be contained in a file INCLUDE'd at an appropriate
point in the program. The statements in order of appearance
are as follows.
1. Schema Entry.


The "schema" statement identifies the schema name
and its privacy lock. Its format is
"schema 'db identifier' with lock 'lock';".
The 'db identifier' must match the schema name and the lock
must match the privacy lock for the schema entry (see Ref.


Cr section Se Eo 


2. Area Entries.


For each area to be used, an area entry must be made.
These entries must be in the same order as the area entries
in the schema. The format of an area entry is
"area 'db identifier list' lock is 'lock list';",
where a 'db identifier list' is one or more comma separated
'db identifier's. A lock list is one or more comma
separated lock entries of the form
"'lock' for 'modifier' 'function'",
where the modifier is optional. For an area entry, the
allowable modifiers are "exclusive" and "protected" and the
allowable functions are "update" and "retrieval". All 'db
identifier's and 'lock's must match the area names and
privacy locks in the schema area entries (see Ref. ec, seem 


Som Sec. 0). 
3. Record Entries.


Record entries must be made in the same order as the
corresponding record entries in the schema. A record entry
is similar in construction to a C language structure
definition. Its format is
"record 'db identifier' lock is 'lock list' {'item list'};",
where the "lock is 'lock list'" phrase is optional. The
'lock list' for records has no modifiers defined. The
functions allowed are "insert", "remove", "store", "delete",
"modify" and "find". The 'db identifier' and 'lock's must
correspond to those of the schema record entry (see Ref. 2,
section 4.2.3).


The item list in a record entry is composed of a
series of item entries of the following form:
"'type specifier' 'db identifier' ['constant expression']
lock is 'lock list';",
where the "['constant expression']" and "lock is 'lock
list'" phrases are optional. A type specifier is of the
form "int", "char", "float", "double", "dbkey" or
"struct {'item list'}". These data types are identical to
those of the C language with the addition of "dbkey". An
item of type "dbkey" appears to be an array of four
characters to the C user. The 'lock list' for items has no
modifiers defined. The permissible functions are "store",
"get" and "modify". The item entries may appear in any order
in the item entry list with the following restrictions.
Items must appear with the same records as in the schema.
The data type of the item must be compatible with the schema
item. Items of type "struct" must correspond to repeating
groups in the schema and have the same dimensionality as in
the schema. Items appearing in a repeating group in the
schema must appear in the item list of the structure
corresponding to that repeating group. Any item in the
schema record description may be omitted.


The record names can be used in non-DML statements
as structures whose format is identical to the record entry.
These record structures are global names and contain the
current record of the respective type.


4. Set Entries.


For each set to be referenced, a set entry of the
following format must exist:
"set 'db identifier' lock is 'lock list' owner is
'db identifier' *'identifier' {'member list'};",
where the "lock is 'lock list'" phrase is optional. The
'lock list' for sets has no modifiers. The defined
functions are "insert", "remove" and "find". The set name
'db identifier' and 'lock's must match those of the
corresponding set entry in the schema description and all
set entries must be in the same order as in the schema
description. The second 'db identifier' must match the owner
name of the set. The member list is composed of one or more
member entries.


5. Member Entries.


A member entry has the form
"member 'db identifier' *'identifier'
lock is 'lock list';",
where the "lock is 'lock list'" phrase is optional. The
'lock list' for members has no modifiers defined and the
defined functions are "insert", "remove" and "find". The
'db identifier' must be the name of a record defined in the
schema as a member record for the set being described.


The *'identifier's named in the set and member
entries become global pointers to the appropriate record
structure. These pointers can be used to reference the
current owner record and member record, respectively, of the
set. In addition the set name 'db identifier' is the name
of a character array which holds the current record of the
set.
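

The name matching rule stated at the start of this section
(lower case into upper case, underscore into dash) can be
sketched in C as follows. This is an illustration only; the
function name db_name_from_c and its buffer-size parameter are
hypothetical and are not part of the DDL.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: translate a C 'db identifier' into the form
 * used for matching against schema names. Lower case letters become
 * upper case and each underscore becomes a dash. The result is
 * truncated to fit out_size bytes, including the terminating NUL. */
void db_name_from_c(const char *c_name, char *out, size_t out_size)
{
    size_t i = 0;

    if (out_size == 0)
        return;
    for (; c_name[i] != '\0' && i + 1 < out_size; i++) {
        unsigned char ch = (unsigned char)c_name[i];
        out[i] = (ch == '_') ? '-' : (char)toupper(ch);
    }
    out[i] = '\0';
}
```

Under this rule a C identifier such as assignment_link would
be matched against the schema name ASSIGNMENT-LINK.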


B. C Language DML.


The DML has several global names and functions associated
with it. Besides the record, item and set names from
the ddl routine, there are the pointers "areaname" and
"recname" and the structure "error". The "areaname" pointer
contains the address of the area array containing the
current record of the process. The "recname" pointer
contains the address of the record structure for the current
record of the process. Note that "recname" provides the user
with the record type of the current record of the process,
but the current record of the process may not be the current
record of that type, and therefore the record pointed to may
not be the current record of the process. The current
record of the process will always be available in the area
array pointed to by "areaname". "Areaname" and "recname"
are set whenever a find or store function is executed. The
"error" vector is a structure with the following format:


struct {
        int status;
        int type;
        char *set;      // pointer to set array for error
        char *record;   // pointer to record for error
        char *area;     // pointer to area array for error
        int count;
} error;


The error codes for C DML functions are designed to be
compatible with error codes defined in Ref. 3 for the COBOL
DML. Additionally, a pointer called "areaid" and an array
of four characters called "keyname" exist for the store
function.


The use of the DML causes certain identifiers to be
generated globally, hence these should be treated as
reserved words by the user. These reserved words are:


all area, record and set names
areaname
recname
error
dbopen
dbclose
find
modify
permit
remove
store
empty
member
owner
current
duplicate
areaid
keyname
get
insert
key


1. DML Expressions.


The DML introduces two additional expression types
into C. These are DML logical expressions and DML record
selection expressions.
a. DML Logical Expressions.


These expressions evaluate to a true/false value
and can be used in a manner identical to normal C logical
expressions. Their forms are


(1) "'db identifier' empty", where 'db identifier'
must appear as a set name in a set entry. The expression
evaluates true if and only if the current set of the
type specified has no members.


(2) "member of 'db identifier'", where 'db identifier'
must be a set name. It evaluates true if and only
if the current record of the process is a member of a set of
the type specified.


(3) "owner of 'db identifier'", where 'db identifier'
must be a set name. It evaluates true if and only
if the current record of the process is the owner of a set
of the type specified.
b. DML Record Selection Expressions.


These expressions result in a data base key
which can be used to find a record. They are evaluated in
part within the user program but must be validated by the
schema DBM program. They must appear only in DML function
argument lists. The forms possible are as follows.


(1) Explicit Key. The simplest form of record
selection expression is by explicit key. The format is
"'key'". The 'key' must be either an item of type "dbkey"
or evaluate to a character pointer. The contents of the
'key' are used as a data base key. This form is useful for
accessing records whose keys are known. It can also be used
for applying currency which has previously been suppressed
(e.g. "find(key(process));" applies all appropriate currency
to the current record of the process).


(2) Owner Record. Selection of an owner record
has the format "'db identifier' owner of 'key'", where the
"of 'key'" phrase is optional. The 'db identifier' is a set
name and the 'key' is an explicit key. If the "of 'key'"
phrase is not used, the owner record of the current instance
of the set specified is selected; otherwise the owner in the
set type specified for the record identified by the 'key' is
selected.


(3) Relative Selection. This form allows the
selection of a particular record from an area or set based
on a location criterion. The expression has the format
"'criterion' 'db identifier' of 'db identifier'". The first
'db identifier' is optional and is the name of a record
type. The second 'db identifier' is the name of an area or
set type. The 'criterion' determines the location within
the area or set from which the record will be selected. The
allowed criteria are "next", "prior", "first", "last" and an
expression which evaluates to an integer. When the record
type is included, only occurrences of that type of record
will be considered for selection. The 'criterion' refers to
the ordering of the area or set. The ordering of an area is
considered to be ascending sequence by data base key.
"Next" and "prior" are relative to the current record of the
area or set. If the current record of the set is the owner
record, "next" and "prior" are equivalent to "first" and
"last" respectively.


(4) CALC Key. If a record type is defined in
the schema as having a location mode of CALC, the format
"duplicate 'db identifier'", where "duplicate" is optional,
may be used. The 'db identifier' is a record type defined
in the schema to have location mode CALC. Prior to the
evaluation of the record selection expression, the items in
the record designated as part of the CALC key must have been
initialized to the desired values.


If the "duplicate" phrase is included, the
current record of the process must be of the specified type
and have the same CALC key as in the record buffer. If
these conditions are satisfied, a synonym to the current
record is selected.





(5) Data Value. Selection by data value is
possible using the format "duplicate 'db identifier' via
'set select' 'db identifier' == 'db identifier' ...", where
the phrase "== 'db identifier' ..." and the word "duplicate"
are optional. The first 'db identifier' is a record
name; the second, a set name; and the string of 'db
identifier's is made up of items in the named record type.
The 'set select' phrase consists of either the word
"current" or the format "'db identifier' ... select". The
'db identifier' string in the 'set select' phrase is made
up of the items needed for the selection path specified in
the SELECTION clause for the named record type and named
set type.


If the word "duplicate" is omitted, the
expression selects the first record occurrence in the
appropriate set with values matching those of the items in
the string of 'db identifier's. If the string is not
specified, the first record of the named type in the set is
selected. When the list of items is specified, the items in
the list must have been initialized to the desired values
prior to evaluation of the record selection expression. If
"current" is included, the current instance of the named set
is used; otherwise the set used is selected on the basis of
the selection criteria in the schema for the named record
type as a member of the named set type.


If the word "duplicate" is included,
"current" must be included and the item list is mandatory.
The record selected will be the next record in the set
matching the current record of the process in the fields
named in the item name string.
2. DML Routines.


The ability to access, retrieve and update records
is provided by DML routines. The routines are divided into
area manipulations (dbopen, dbclose), record manipulations
(key through delete) and set manipulations (insert, remove).
The permit function does not fit into any of these
categories. Considerable overlap between categories exists
among the other functions. All functions have the same form
as normal C subroutine calls. In the following description,
all error codes listed have a two decimal digit major code
and a two decimal digit minor code specifying the function
and specific error respectively.
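

The split into major and minor codes can be sketched as
follows. This is an illustration only: it assumes that
error.status holds the code as a decimal integer, and the
names dml_code, dml_split and dml_function_name are
hypothetical. The function numbers in the table are those that
appear with the error codes listed in this appendix. Note that
a C constant written with a leading zero, such as 0307, is
read as octal, so the decimal value 307 must be written
without the leading zero.

```c
/* Hypothetical result type: the two halves of a DML error code. */
struct dml_code {
    int major;      /* function that failed, e.g. 3 for find */
    int minor;      /* specific error, e.g. 7 for end of set or area */
};

/* Split a four digit decimal code such as 307 (printed as 0307). */
struct dml_code dml_split(int status)
{
    struct dml_code c;

    c.major = status / 100;
    c.minor = status % 100;
    return c;
}

/* Function names for the major codes seen in this appendix. */
const char *dml_function_name(int major)
{
    switch (major) {
    case 1:  return "dbclose";
    case 3:  return "find";
    case 5:  return "get";
    case 8:  return "modify";
    case 9:  return "dbopen";
    case 12: return "store";
    default: return "unknown";
    }
}
```

A loop such as the one in Figure 10 could then distinguish the
end of set condition (major 3, minor 7) from other failures of
find.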


a. Permit. The permit function must be called only
once and must be called before any other DML functions. It
causes validation of the subschema by the schema DBM program
and establishment of the privacy permissions required. If
the schema lock is violated, an error code of 0910 is
returned in error.status. If any other privacy lock is
failed, no indication is given until the user program
attempts to use the feature not properly unlocked. If a
mismatch occurs between the schema definition and the
subschema definition, error code 0060 will be returned in
error.status. When this occurs, error.count will contain the
number of incompatibilities; error.type will be 1, 2 or 3
depending on whether the first error encountered was in an
area, record or set entry; error.area, error.record and
error.set will indicate the first erroneous entry in the
area, record and set entries respectively. The entry number
returned identifies entries in the C subschema and is zero
if no errors were encountered. If any other DML function is
attempted prior to permit, error code nn61 will be returned,
where nn indicates the function attempted.


b. Dbopen. Prior to processing any records in an
area, the user program must call dbopen to open the area.
Dbopen parameters are an opening mode and a list of area
names. The opening mode is an octal code formed as follows.
If the low order bit is 1, the mode is for update and
retrieval; otherwise it is retrieval only. The next most
significant two bits are zero for concurrent update
permitted; 1 for concurrent retrieval but no concurrent
update (protected mode); and 2 or 3 for no concurrent use
permitted (exclusive mode). All the areas in the parameter
list are opened in the specified mode. If no area list is
specified, all the areas in the subschema are opened. For
all temporary areas opened, a mode of exclusive update is
assumed no matter what mode is specified. Modes allowing
concurrent processes to update areas are included for
compatibility purposes; however, unless the implementation
of the data base management system is modified, these modes
can cause severe integrity problems.
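

The bit layout of the opening mode can be sketched as
follows. The macro and function names are hypothetical and are
introduced only for illustration; the DML described here takes
the octal code directly.

```c
/* Hypothetical names for the bits of the dbopen opening mode. */
#define DB_RETRIEVAL   00   /* low order bit 0: retrieval only */
#define DB_UPDATE      01   /* low order bit 1: update and retrieval */
#define DB_CONCURRENT  00   /* next two bits 0: concurrent update permitted */
#define DB_PROTECTED   02   /* next two bits 1: protected mode */
#define DB_EXCLUSIVE   04   /* next two bits 2 (or 3): exclusive mode */

/* Returns 1 if the mode allows update, 0 if retrieval only. */
int db_is_update(int mode)
{
    return mode & 01;
}

/* Returns 0 for unprotected, 1 for protected, 2 for exclusive.
 * Field values 2 and 3 both mean exclusive. */
int db_sharing(int mode)
{
    int s = (mode >> 1) & 03;

    return s >= 2 ? 2 : s;
}
```

Under this layout protected update is octal 03
(DB_UPDATE | DB_PROTECTED) and exclusive retrieval is octal 04.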
To successfully execute a find, store, delete or
close, appropriate areas must be open as follows: all areas
which contain any record occurrence which would be deleted
or removed by a delete statement and all areas which are the
objective of a close function. If any of these functions
fails to meet these conditions, error status nn01 is
returned, where nn indicates the function attempted.


In addition to the areas containing the object
records of the functions cited above, there are additional
(i.e. implicit) areas which could be impacted by DML
functions. This impact can be of two forms: the DBM program
requires information contained within the implicit area (in
which case the area must be "available") or the DBM program
must alter the information contained in records in the
implicit area (in which case the area must not only be
available, but it must permit the necessary alteration).
Implicit areas requiring modification are termed "affected".


A user may assume the following areas will be
affected: all areas containing any record which participates
in a set occurrence into which a record is to be inserted or
from which a record is to be removed or deleted, and all
areas containing any records which participate in any set
occurrence whose membership or sequence is altered by a
store or modify function. If an implicit area which is
affected is not open, error code nn21 will be returned to
error.status, where nn indicates the function attempted.


To successfully execute insert, remove, store,
delete or modify functions, both explicitly and implicitly
affected areas involved must be opened for update. If any
of the involved areas are open for retrieval only, an error
code of nn09 will be returned to error.status, where nn
indicates the function attempted.


Record occurrences which are in the search path
of a find or an implicit find which is the result of a
store, remove or delete function need only be in areas which
are available. In order for an area to be available, it
must not be opened for an exclusive mode by a concurrent
process. Although it need not be open, the full overhead of
a dbopen and dbclose will be incurred for each implicit
reference to an area which is not open. If an implicit area
is not available, error code nn18 will be returned to
error.status, where nn indicates the function attempted.


Any attempt to execute a dbopen function which
would result in a usage mode conflict for any area will
result in the failure to open every area. Additionally,
error code 0929 will be returned in error.status. A usage
mode conflict will occur under the following conditions: any
mode of update on an area opened in an exclusive or
protected mode by another process; any protected mode on an
area opened for update by another process; exclusive mode on
an area opened for any mode by another process; and any mode
on an area opened for exclusive use by another process. In
order to prevent deadlock conditions, a process should open
all areas needed for exclusive or protected use in one
dbopen. If a dbopen fails because of usage conflict, the
process should close any other open areas obtained
previously.
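

The four usage mode conflict conditions can be written as a
single test. The following is a sketch only, assuming the
opening mode bit layout described for dbopen (low order bit
for update, next two bits for protected or exclusive use); the
function names are hypothetical and do not appear in the DBMS.

```c
/* Returns 1 if the mode allows update, 0 if retrieval only. */
static int is_update(int mode)
{
    return mode & 01;
}

/* Returns 0 for unprotected, 1 for protected, 2 for exclusive. */
static int sharing(int mode)
{
    int s = (mode >> 1) & 03;

    return s >= 2 ? 2 : s;
}

/* Returns 1 if opening an area in mode `requested` conflicts with
 * another process that already holds the area in mode `held`. */
int db_mode_conflict(int requested, int held)
{
    if (sharing(held) == 2)
        return 1;   /* any mode on an area held for exclusive use */
    if (sharing(requested) == 2)
        return 1;   /* exclusive mode on an area held in any mode */
    if (is_update(requested) && sharing(held) == 1)
        return 1;   /* update on an area held in protected mode */
    if (sharing(requested) == 1 && is_update(held))
        return 1;   /* protected mode on an area held for update */
    return 0;
}
```

A dbopen naming several areas would apply such a test to every
area and, as stated above, open none of them if any test
reports a conflict.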


If a privacy lock is violated, error code 0910
is returned in error.status. If an area opened was already
open, warning error code 0928 is returned in error.status.
The total number of errors encountered is returned in
error.count.


c. Dbclose.


When an area is no longer needed it may be
released for use by other processes with the dbclose
function. Dbclose parameters are a list of area names. All
the areas in the list are closed. If the parameter list is
omitted, all open areas are closed. After the dbclose is
executed, all current records in closed areas cease to be
current. If any area named in the parameter list is not
open, error code 0101 is returned to error.status and
error.count will contain the number of errors detected.


When the process terminates (even abnormally),
no dbclose is needed as all areas will be closed. If the
dbclose function is executed on a temporary area, the data
within the area is not lost and the area can be reopened and
processed. When the process terminates, however, all
temporary areas, open or closed, are lost.


d. Find.


The find function allows the user program to
select a record from the data base and make it the current
record of the run unit and, selectively, of the appropriate
record and set types. The parameters are a record selection
expression and a suppress code. The record selection
expression is discussed in 1.b above. The suppress code is
an octal code whose least significant bit indicates set
suppression and whose next least significant bit indicates
record suppression. If set suppression is indicated,
additional find parameters are permitted, each of which is
the name of a set type.
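

The suppress code can be sketched in the same way as the
opening mode. The names below are hypothetical, and the sketch
assumes that the least significant bit stands for set
suppression, as the surrounding description implies.

```c
/* Hypothetical names for the bits of the find suppress code. */
#define SUPPRESS_SET     01  /* least significant bit: set suppression */
#define SUPPRESS_RECORD  02  /* next bit: record suppression */

/* Returns 1 if find updates the currency of the record type. */
int find_updates_record_currency(int suppress)
{
    return !(suppress & SUPPRESS_RECORD);
}

/* Returns 1 if find updates set currency for every set type.
 * When set names follow the suppress code, suppression applies
 * only to the named sets. */
int find_updates_all_set_currency(int suppress)
{
    return !(suppress & SUPPRESS_SET);
}
```

A suppress code of octal 03 would therefore leave both record
and set currency untouched, while a code of 0 updates all
currency.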


Execution of a successful find function causes
the selected record to become the current record of the
process, the area in which it is located, the record type of
the record and all set types in which it participates as an
owner or member record. If record or set suppression is
indicated, the object record does not become current for
these types. When the list of set names is included,
currency update is suppressed only in the named sets. After
a find, the data fields of the record are not available, but
its data base key can be derived (through the key function),
a pointer to its area buffer is in "areaname" and a pointer
to the appropriate record type structure is in "recname".
The record can now be retrieved by the get function.


The following error codes may be returned into
error.status by a find.

0301 The sought record is in an area which is not open.

0318 A record occurrence along the search path of the
     find is in an area under the exclusive control of
     another process.

0302 A data base key was supplied or developed which is
     incompatible with the areas specified for a record
     of this type.

0307 An end of area or end of set condition was
     detected.

0326 No record in the area for selection through CALC
     key satisfies the record selection expression.

0322 Owner record selection is specified and the data
     base key given is for a record which does not
     participate in a set of the desired type.

0323 Relative selection was specified and the specified
     record cannot be in the desired area.

0310 A privacy breach was attempted.

0361 No call to the permit function has been made.


e. Get.


The get function is used to transfer the data
values of the current record of the process into the
process' buffers. Its parameters, which are optional, are
item names from some record type. If the record type and
item names are specified, only the items named are
extracted. If the item names are not specified, all the
items defined in the subschema for that record are
extracted. A get must be executed for a record before any of
its item values can be examined.


The following error conditions may be returned
to error.status for a get.

0513 The current record for the process is unknown.

0510 A privacy breach was attempted.

0520 A record name is specified and the current record
     of the process is not of that type.

0561 No call to the permit function has been made.

0554 Truncation of significance occurred during
     conversion from the schema type to the subschema type
     for an item.

In all but the last case, no data is transferred to the user
process.


f. Store.


The store function is used to create a new
record occurrence in the data base. It acquires space and a
data base key for a new record occurrence in the data base,
causes the data items in the record's buffer to be used in
initializing the record, inserts the record into all sets in
which it is an automatic member and establishes a new set
occurrence of each set type for which the record is defined
as an owner in the schema.


The parameters of the function are a record
name, a suppress code and one or more set names. The record
name specifies the record type to be created. The suppress
code and set names are exactly analogous to those of the
find function. In order for the store to function properly,
the subschema must include the following: the named record;
the "data-base-identifiers" or set specified in the
"LOCATION" mode clause of the record; at least one of the
areas specified in the "WITHIN" clause for the record; all
sets in which the record is defined as an automatic member;
and all "data-base-identifiers", records and sets specified
or referenced in the "SELECTION" and "KEY" clauses of the
set member subentries in which the record is defined as
automatic (see Ref. 2, section 3.4.0).


Prior to calling store, it is the user program's
responsibility to ensure the following is done. All data
items in the record type buffer must be initialized. If
multiple areas are defined in the "WITHIN" clause for the
record with the "data-base-data-name-1" option (see Ref. 2,
section 5.5.9) and the "LOCATION" mode is not direct,
"areaid" must contain the desired area pointer. If the
"LOCATION" mode is direct with the "data-base-data-name-1"
option, keyname must have the appropriate data base key
stored in it. If any automatic membership has a "SELECTION"
method of "THRU CURRENT", the current record of the set type
must specify the correct set. All data items mentioned in
the selection clauses of the member entries which are
automatic for the record and all data items mentioned in the
"LOCATION" clause of the record entry must be initialized.


If an error occurs during the execution of a
store, the new record is not created, no currency indicators
are changed and an error code is returned to error.status.
The following errors and codes can be encountered.

1201 The object record is to be stored in an area which
     is not open.

1221 A record occurrence which is affected by the store
     function is in an area which is not open.

1209 The object record of the store or some record
     occurrence affected by the store is in an area
     which is open for retrieval only.

1218 Some record occurrence needed by the store for
     information (e.g. search paths) is in an area
     which is not available.

1212 No data base keys are available.

1211 No media space is available.

1202 A data base key passed by the user or generated
     via a "CALC" procedure is not valid.

1205 The record would violate a "DUPLICATES NOT
     ALLOWED" clause defined for one of the records or
     sets involved.

1225 For one of the set types involved a set occurrence
     cannot be matched to the relevant set selection
     criteria.

1210 A privacy breach was attempted.

1227 A check clause applies and one of the data items
     did not pass.

1223 The area specified for the record is not one of
     those in the record's "WITHIN" clause.

1224 The execution of the store statement would cause a
     set occurrence to have records in both temporary
     and permanent areas.

1219 The value of an item cannot be converted to the
     type specified in the schema for that item.
g. Modify.


The modify function enables the updating of some
or all of the data items defined in the sub-schema for a
record and the changing of set occurrences in which a record
participates. The parameters, which are all optional, are a
record name, a list of items in the record and set
parameters identical to those of the insert function (see
Section B.2.j). If the items are not specified, then every
item in the record which is known to the sub-schema is
updated; otherwise only the named items are updated. If the
set names are specified, the action taken is equivalent to a
remove function followed by an insert function for the named
sets with the following exceptions. The record must be in an
occurrence of every set named prior to the modify function.
The set membership in the named sets can be defined as
mandatory or automatic, or both.


The object of the modify 1s the current record 
o f the process. All data items to be updated and al] items 
required for an insert on the named setsSs must be initial= 


1z2ed0 for the modify. If any of the modified data items are 
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sort control items for a set occurence in which membership 
is retained, the position within the set is modified accord= 
ingly. If any of the items changed are in a "SEARCH" key 
clauser the index is updated. The record becomes the current 


record of its record tyne and all sets it has membership in. 


If an error occurs during a modify, no data base
or currency changes are made and an error code is returned
to error.status. The possible error conditions and the
associated codes are as follows.

0803 One of the items changed is in a CALC key and the
data base key would be altered, or an area number
specified for owner record selection disagrees
with the CALC key developed for the owner.

0825 A set occurrence satisfying the specified criteria
was not found.

0822 The record is not currently a member of every
specified set.

0805 The insertion of the record into a set occurrence
would violate a "DUPLICATES NOT ALLOWED" clause.

0810 A privacy breach was attempted.

0827 A check clause was failed.

0821 Some record occurrence affected by the modify is in
an area which is not open.

0821 The object record or some record occurrence
affected by the modify is in an area which is not
open for update.

0818 Some record occurrence which is implicitly
referenced is in an area open for exclusive use by
another process.

0819 A modified data item cannot be converted into the
format used by the schema for the item.

0824 Insertion of the object record into some set
occurrence would cause that set occurrence to have
members in both temporary and permanent areas.

0861 No call to the permit function has been made.

In all cases, no change is made to the data base or to the
currency indicators of the process.
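
The codes in the error lists of this appendix appear to share a layout: the leading two digits identify the function (for example 08 for modify and 12 for store) and the trailing two digits identify the condition (for example 10 for a privacy breach). The following C sketch splits a status value on that assumption. The names split_status and function_name are hypothetical and are not part of the DBMS interface, and the status is assumed to be held as a decimal integer.

```c
/* Hypothetical decomposition of an error.status value such as 0810.
 * In C source the value must be written as decimal 810, because a
 * literal with a leading zero is read as octal. */
struct status_parts {
    int function;   /* leading two digits, e.g. 8 for modify */
    int condition;  /* trailing two digits, e.g. 10 for a privacy breach */
};

struct status_parts split_status(int status)
{
    struct status_parts parts;

    parts.function = status / 100;
    parts.condition = status % 100;
    return parts;
}

/* Map a function prefix to a name. Only prefixes that appear in the
 * error lists of this appendix are included (an assumption drawn from
 * those lists, not from the DBMS source). */
const char *function_name(int function)
{
    switch (function) {
    case 2:  return "delete";
    case 7:  return "insert";
    case 8:  return "modify";
    case 11: return "remove";
    case 12: return "store";
    default: return "unknown";
    }
}
```

A user program could call split_status(error.status) after a failed function and test the condition field, so that one check for a privacy breach serves every function.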


h. Key.

The key function allows the extraction of the
data base key for one of the current records. The function
needs one parameter which may be a record, set or area name
or the word "process". The function returns a pointer to a
character array containing the data base key for the current
record of the input parameter. The key should be treated as
read only.

If an error occurs during the key function, a
null pointer is returned and the error code is returned to
error.status. The error conditions possible are no current
record exists for the input parameter passed (code 1306) and
no call to the permit function (code 1361).

i. Delete.


The delete function is used to destroy the
current record of the process, releasing its data base key
and storage, and to selectively delete all of the records
which are members of set occurrences owned by the current
record of the run unit. The function requires a single
integer parameter in the range zero to three with meaning as
follows. A zero parameter causes deletion of the record if
and only if it is not the owner of any non-empty set
occurrences. If the parameter value is one, the record is
deleted, all optional members are removed from its set
occurrences and all mandatory members of its set occurrences
are deleted. If the parameter is two, the action is identical
to that of one except that, if any of the records whose
membership is optional do not participate in set occurrences
owned by a different record, then they are deleted also. If
the parameter value is three, then the record and every
member of its set occurrences are deleted. For any member
record deleted, the deletion of the member records in that
record's sets is decided as if that record were the object
of a delete function with an identical parameter as that for
the originally deleted record.
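
The effect of the delete parameter on each member record can be summarized as a small decision function. The C sketch below is illustrative only: the names delete_allowed and member_action_for are hypothetical, and the membership facts are passed in as flags rather than read from a data base.

```c
/* Hypothetical outcome for one member record of a set occurrence
 * owned by the record being deleted. */
enum member_action { MEMBER_REMOVED, MEMBER_DELETED };

/* Returns 1 if a delete with the given parameter may proceed.
 * Parameter zero is refused when the record owns a non-empty set. */
int delete_allowed(int param, int owns_nonempty_set)
{
    if (param < 0 || param > 3)
        return 0;
    return !(param == 0 && owns_nonempty_set);
}

/* Decide the fate of a member record for parameters one to three.
 * optional: membership of the record in this set is optional.
 * in_set_of_other_owner: the record also participates in a set
 * occurrence owned by a different record. */
enum member_action member_action_for(int param, int optional,
                                     int in_set_of_other_owner)
{
    if (param == 3)
        return MEMBER_DELETED;      /* every member is deleted */
    if (!optional)
        return MEMBER_DELETED;      /* mandatory members are deleted */
    if (param == 2 && !in_set_of_other_owner)
        return MEMBER_DELETED;      /* optional member with no other owner */
    return MEMBER_REMOVED;          /* optional members are only removed */
}
```

Because each deleted member is treated as the object of a delete with the same parameter, an implementation would apply member_action_for again to the members of every record it deletes.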


If an error occurs during the function, no
records are removed or deleted and an error code is returned
to error.status. The possible errors are as follows.

0230 A delete with parameter zero was attempted and the
record owns a non-empty set occurrence.

0213 The current record of the process is unknown.

0210 A privacy breach was attempted.

0221 One of the affected member records is in an area
which is not open.

0209 The current record of the process or some affected
record is in an area open for retrieval only.

0218 An implicitly referenced record is in an area
which is open for the exclusive use of another
process.

0208 The sub-schema does not know about all the record
types which would be deleted or removed, or all of
the set types of set occurrences which would have
records removed.
j. Insert.

This function causes the current record of the
process to become a member of an occurrence of the specified
set types, providing it is defined as an optional automatic,
optional manual or mandatory manual member of those sets.
The parameters, which are optional, are a record type and
one or more set names. Additional parameters may follow each
set name depending on the selection criteria for the member
entry of the object record's type. If the root set in the
selection path has "DATABASE-KEY" specified with the
"data-base-data-name-1" option (see Ref. 2, Section 3.4.0),
a pointer to an array containing a data base key or an item
name of type dbkey must be included. If the root set has
the "CALC-KEY" option with the "data-base-data-names"
specified, item names or pointers to data whose type matches
that of the corresponding items in the CALC key must be
included. For each set after the root in the selection path
which uses the "EQUAL TO data-base-data-name-4" option (see
Ref. 2, Section 3.4.0), an item name or pointer to data
which matches the type of the data item specified in the
selection clause must be included. In addition to the
explicit parameters above, all data items needed in the
selection path as specified in the selection clause must be
specified. If the owner record's "WITHIN" clause specifies
multiple areas, "areaid" or the appropriate data item must
be initialized to the appropriate area. See Reference 2,
Section 3.4.11 for a description of the selection clause.
If a set name is specified with no additional parameters,
then the set used is the current set of that type.

If the set names are specified, the record must
not be in an occurrence of any of the named set types. If no
set names are specified, the record is inserted into the
current occurrence of each set type for which the record is
defined as optional automatic, optional manual or mandatory
manual, provided the record does not already participate in a
set of that type. After the insert, the record becomes the
current record of every set to which it has been added.


If an error occurs during an insert, the data
base remains unchanged, no currency indicators change and
the appropriate error code is returned into error.status.
The possible error conditions are as follows.

0713 The current record of the process is unknown.

0714 Set names are specified and the record is not
defined as an optional automatic, optional manual
or mandatory manual member of each of them.

0705 The record, when inserted, would violate a
"DUPLICATES NOT ALLOWED" clause for some record or set
involved.

0710 The current record of some set name specified in a
"CURRENT" clause of a selection entry is unknown.

0716 The record is already in an occurrence of a set
explicitly specified or of every set implicitly
specified.

0720 The record type was passed as a parameter and
disagrees with the type of the current record of
the process.

0721 A record occurrence which is affected is in an area
which is not open.

0709 The record inserted or some affected record is in
an area which is open for retrieval only.

0718 A record occurrence implicitly referenced by the
insert is in an area which is not available.

0724 Insertion of the record into a set would cause the
set to have members in both temporary and
permanent areas.

0761 No call to the permit function has been made.





k. Remove.

This function is used to cancel the membership
of the current record of the process in specified set
occurrences for which the record's membership is optional.
The parameters, which are optional, are a record name and
one or more set names. If the set names are specified, the
object record must participate in an occurrence of at least
one of them and its membership in each of them is canceled.
If no set names are specified, every optional membership in
a set occurrence for the record is canceled.

If an error occurs during the remove, no set
memberships are canceled, no currency information is
affected and the error condition is returned into
error.status. The following errors are possible.

1113 The current record of the process is not known.

1120 A record type parameter was passed and it
disagrees with that of the current record of the
process.

1115 The record is not defined as an optional member of
any named set type.

1122 The record does not participate in at least one of
the sets named; or if no sets are named, in at
least one of the possible sets for which it is
optional.

1110 A privacy breach was attempted.

1121 Some record affected by the remove is in an area
which is not open.

The current record of the process or some affected
record is in an area which is not open for update.

Some implicitly referenced record is in an area
opened for exclusive use by another process.

No call to the permit function has been made.

APPENDIX B. FILES ASSOCIATED WITH A SCHEMA.


A. Files in the Schema Directory. 


Most of the files associated with a schema are contained
in a directory bearing the name of the schema. This directory
becomes the current directory for the schema DBM program.
In the description of the files within the schema
directory, the term "schemaname" indicates a variable portion
of a file name which is replaced by the name of the
particular schema when the files are named.

1. Source Description File.

The Source Description File contains the schema
description in the source CODASYL DDL form. Its name is
"s.schemaname".

2. Encoded Description File.

The Encoded Description File contains the compiled
description of the schema. It contains data base names and
encoded descriptions for the areas, records and sets in the
schema. It is used principally by the schema DBM program in
the initialization process. Its name is "des.schemaname".

3. Schema DBM Program.

The schema DBM program is the data base manager for
the schema. It is comprised of the DBM skeleton routine
compiled together with any data base procedures used in the
schema. Its name is "dbm.schemaname".

4. Schema Library.

This file is optional and, when present, contains
data base procedures unique to the schema. It is named
"lib.schemaname".

5. Area Data Files.

These files contain the data for all the defined
areas in the schema which are not designated as temporary.
Their names are the same as the areas which they represent.

6. Area Data Base Key Files.

These files contain the byte offsets associated with
each data base key for the areas which are not designated as
temporary. The files are named by prefixing the area name
by [illegible].

7. Index Block File.

This file provides storage for all the indices used
for set linkage in the data base. It is called
"index.schemaname".

8. Open Lockout File.

This file is used by integrity routines to lock out
other processes when setting up exclusive or protected
access privileges for a user process. It is created to
initiate any open or close operation and removed when the
operation is completed. The file is named "opendum".

9. Index Lockout File.

This file is used to lock out other processes while
attempting to acquire an index from the index block file.
It is handled in a manner analogous to the open lockout
file. The file is named "indexdum".
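
The lockout files act as locks whose presence alone signals that an operation is in progress. The C sketch below shows one way to obtain that behavior, assuming a modern POSIX system on which open with O_CREAT and O_EXCL fails if the file already exists. The mechanism actually used by the DBM skeleton is not described here and may differ, and the names lockout_acquire and lockout_release are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Try to create the lockout file. Returns 0 if this process now holds
 * the lock and -1 if the file already exists or cannot be created. */
int lockout_acquire(const char *path)
{
    /* O_EXCL makes creation fail when the file is already present. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;
    close(fd);          /* only the existence of the file matters */
    return 0;
}

/* Remove the lockout file when the operation is completed.
 * Returns 0 on success and -1 on failure. */
int lockout_release(const char *path)
{
    return unlink(path);
}
```

A process that fails to acquire the lock would normally wait and retry. A file left behind by a crash blocks every later attempt until it is removed.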

10. Message Buffer File.

This file is used by the schema DBM program to
assemble messages to the user program which are longer than
512 characters. Prior to storing a new message, the file is
truncated to zero length.
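
Several of the files in the schema directory are named by joining a fixed prefix to the schema name with a period, as in "s.schemaname" and "des.schemaname". A minimal C sketch of that convention follows. The function name schema_file_name is hypothetical, and the schema name "payroll" in the usage note is an invented example.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "prefix.schema" in out. Returns 0 on success and -1 if the
 * name does not fit in outlen bytes (including the terminating null). */
int schema_file_name(char *out, size_t outlen,
                     const char *prefix, const char *schema)
{
    int n = snprintf(out, outlen, "%s.%s", prefix, schema);

    if (n < 0 || (size_t)n >= outlen)
        return -1;      /* encoding error or truncated name */
    return 0;
}
```

For a schema called "payroll", the prefixes "s", "des", "dbm", "lib" and "index" would give "s.payroll", "des.payroll", "dbm.payroll", "lib.payroll" and "index.payroll".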


B. Files in the Temporary Directory.

Certain files for a schema are stored in the UNIX
temporary directory ("/tmp"). This directory has the
characteristic that should a system crash occur, all the files
contained within it are lost. This directory is used to
store files which are associated with the running of a
process and therefore should be lost if the process is
terminated by a system crash. The files are as follows.

1. Area Files.

These files are the area data files and area data
base key files for all areas designated as temporary. The
naming conventions for these files are identical to those
for non-temporary data and data base key files with the
exception that the process id (pid) of their user process is
suffixed to the name.

2. Logical Usage Block File.

This file contains the logical usage block. This
block is used during open and close operations to record the
usage modes for the various areas currently in use. Its name
is the same as the name of the schema.





APPENDIX C. DBM - THE DBM REQUEST PROCESSOR.

A. Introduction.

Dbm is a simple command language processor for schema
level requests. It enables the data administrator to perform
such functions as compiling a schema, moving data from
schema to schema and garbage collection. It provides the
user with a method of executing a program to utilize the
schema. The functions of "ALTER", "DISPLAY" and "LOCKS"
described in Ref. 2 are provided by different means.
Namely, the UNIX "ed" and "list" functions are used, with
privacy provided by the file access privacy of UNIX. The
function "COPY" (for subschema use) is inapplicable since
the C language DDL is not a proper subset of the schema DDL
as was the case with COBOL. In addition, the cross checking
of the sub-schema and the schema is done at execution time.


Prior to using dbm, the schema directory must exist
(UNIX function "mkdir") and the schema source file should
have already been created using "ed", the UNIX text editor.
Dbm is called by entering "dbm pathname", where the pathname
is a path ending with the schema name of the schema to be
used. The program will respond in one of two ways: it will
display "cannot access schema" or ">". The first response
indicates that either the schema specified does not exist,
access privacy prevents access, the schema is not a
directory, or a file called "s.schemaname" does not exist in
the directory, where "schemaname" is the name of the schema.
This response is followed by immediate termination of the
program. The second response is the dbm prompt character
and means that dbm is ready to accept commands.


B. Commands. 


Upon receiving the prompt character, the user has the
option of specifying any of six commands as follows.

1. Compile.

The compile command causes the schema to be compiled.
The command format is "c" followed by a carriage
return. The compilation process causes the scanning of the
schema source file, "s.schemaname", and creation of the
encoded schema description file, "des.schemaname", and the
schema data base manager, "dbm.schemaname". If the necessary
permissions are not present to create these files, dbm
displays "cannot compile". If errors exist in the source
file, they are displayed. In order to divert the error list,
an optional path name parameter is allowed with the "c"
command. If the specified file can be opened, the error listing
is output to it; otherwise an error message is displayed at
the user's terminal. When the "c" command is finished, the
user receives a prompt. The compiler is currently a stub.

2. Move.


This command allows data to be moved from an old
version of a data base to the current one. The command format
is "m" followed by a path to a schema directory. Move
will check the specified schema name to determine if it is a
directory containing an encoded schema description and
schema data base manager. If the schema is nonexistent or
inaccessible, dbm will display "schemaname cannot be
accessed"; otherwise the data from the designated schema
will be moved to the current schema. The data moved is
selected by finding all area, record and set entries with
common names and transferring the data which is associated
with these common areas, records and sets. Area, record and
set entries should have the same order in both schemas. All
data presently in the current schema will be lost. If the
move is unsuccessful, move produces error messages. After
the move is completed, the user receives a prompt. The move
function is currently a stub.

3. Execute.

This command causes the execution of a user program
to access the data base. Its format is "x" followed by a
path to a user program and the arguments for that user
program. If the user program is inaccessible, nonexistent or
not a program, dbm prints an error message and prompts.
Otherwise dbm executes the user program and, upon its
termination, prompts.





4. Garbage Collection.

This command allows waste compression in area data
files for the data base. The format is "g" followed by a
carriage return. This causes the following events for each
area in the schema. The message "number of bytes wasted in
areaname is NNN. Collect? (y or n)" is displayed, where
"areaname" is the area being processed and "NNN" is the area
waste count. Entering "y" causes the area file to be
recreated with all records written in ascending order of
data base key and with all wasted space eliminated. Entering
"n" causes the next area to be processed. When all areas
have been processed, the user is prompted.

Due to the lack of a garbage collection facility in
the schema dbm skeleton, frequent garbage collection may be
necessary. Note garbage collection causes any assignment of
data base keys designed to juxtapose related records to be
reflected in the area data file as well.
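
The rewriting of an area in ascending order of data base key can be pictured as sorting the live records by key and assigning each a new offset immediately after its predecessor. The C sketch below computes only that new layout, not the copying of data. The structure area_record and the function compact_layout are hypothetical and assume that keys are unsigned integers and that deleted records have already been dropped from the array.

```c
#include <stdlib.h>

/* Hypothetical description of one live record in an area data file. */
struct area_record {
    unsigned key;   /* data base key */
    long offset;    /* byte offset in the area file */
    long length;    /* record length in bytes */
};

/* qsort comparison: ascending order of data base key. */
static int compare_keys(const void *a, const void *b)
{
    const struct area_record *x = a;
    const struct area_record *y = b;

    return (x->key > y->key) - (x->key < y->key);
}

/* Sort the records by key and assign contiguous offsets starting at
 * zero. Returns the size of the compacted area in bytes. */
long compact_layout(struct area_record *recs, size_t count)
{
    long next = 0;
    size_t i;

    qsort(recs, count, sizeof recs[0], compare_keys);
    for (i = 0; i < count; i++) {
        recs[i].offset = next;      /* no gap before this record */
        next += recs[i].length;
    }
    return next;
}
```

After such a pass, records with adjacent data base keys are also adjacent in the file, which is the juxtaposition effect noted above.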


5. Free.

If UNIX crashes during a dbm spawned function,
certain files may be left in a state making restart impossible.
The command "f" followed by carriage return causes this
condition to be eliminated. The free command removes the files
"opendum" and "indexdum" from the schema directory, if they
exist, and scans the index block file, "index.schemaname",
freeing any locked indices.





C. Interprocess Communication.

Whenever dbm must create a process, it uses the UNIX
functions "fork" and "exec". The former causes a complete
copy of the current process, called the child, to be created
and the latter causes the current process to be overlaid and
replaced by the program specified. Dbm creates children as
needed to do its work. Whenever a need may arise for the
children to communicate with each other, dbm creates
interprocess communication pipes. These pipes appear to be a
pair of files, called the ends, each with an open file
descriptor. One end of the pipe is open for reading and the
other for writing. Since all children of a parent executing
a pipe call have the pipe open also, the pipe can be used
for passing data back and forth. Certain protocols must be
observed, however. The pipe can only effectively be used
for one way transmission since there is no protocol for
preventing a process from reading its own transmissions back
before the intended receiver has a chance to read them. The
receiver should close the writing end of the pipe, otherwise
the receiver will wait forever if trying to read the pipe
after the sending process has terminated. This phenomenon is
caused by the fact that the process reading a pipe will go
into wait state if any process, including the process doing
the read, has its writing end of the pipe open. If no
process has its writing end open (termination automatically
closes all of a process' open files and pipes), a read on
the pipe will return an end of file condition.
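
The protocol described above can be illustrated with a short C sketch written against the present-day POSIX interface, which differs in detail from the UNIX calls available when dbm was written. A child process sends a message through a pipe, and the parent, acting as receiver, closes its writing end before reading so that it sees end of file once the child terminates. The function name pipe_round_trip is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes msg into a pipe, then read it back in the
 * parent until end of file. Returns the number of bytes read, or -1. */
long pipe_round_trip(const char *msg, char *buf, size_t buflen)
{
    int fds[2];             /* fds[0] is the reading end, fds[1] the writing end */
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    if (buflen == 0 || pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: the sender */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }
    /* Parent: the receiver. Without this close, read would never
     * report end of file, because a writing end would remain open. */
    close(fds[1]);
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

The pipe is used in one direction only, from child to parent, in keeping with the one way restriction given above.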
APPENDIX D. SCHEMA DESCRIPTION FILE FORMAT.

The schema description file contains the encoded schema
description. The file is used by the dbm move command and
in initializing the schema DBM program. Its format is
described below.

A. Schema Entry.

The schema entry is headed by a null terminated string
containing the schema name. Next is a privacy lock consisting
of a null character, if no privacy lock is defined, or a
one character type followed by a null terminated string,
if a privacy lock is defined. If the lock is defined, a
lock type of "s" indicates that the string is a lock string
and "p" indicates a lock data base procedure name.
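
Since the same privacy lock format recurs throughout the file, a reader of the file needs a routine that consumes one lock and reports how many bytes it occupied. The C sketch below is one possible form of such a routine. The structure privacy_lock and the function parse_privacy_lock are hypothetical names, and the encoded data is assumed to be already in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one privacy lock entry. */
struct privacy_lock {
    char type;          /* 0 if no lock is defined, else the type character */
    const char *text;   /* lock string or procedure name, NULL if no lock */
};

/* Parse one privacy lock from buf[0..len). Returns the number of bytes
 * consumed, or 0 if the entry is malformed or truncated. */
size_t parse_privacy_lock(const unsigned char *buf, size_t len,
                          struct privacy_lock *out)
{
    const unsigned char *end;

    if (len == 0)
        return 0;
    if (buf[0] == '\0') {               /* single null: no lock defined */
        out->type = 0;
        out->text = NULL;
        return 1;
    }
    end = memchr(buf + 1, '\0', len - 1);
    if (end == NULL)
        return 0;                       /* string is not terminated */
    out->type = (char)buf[0];
    out->text = (const char *)(buf + 1);
    return (size_t)(end - buf) + 1;     /* type, string and terminator */
}
```

Returning the consumed length lets the caller step through the six or seven consecutive locks that follow each group of procedure names.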


B. Area Entries.

The area entries are preceded by a two byte number which
is the number of areas and a two byte maximum record
size. Each entry contains the following items.

The area name is a null terminated string. The temporary
indicator is a one character flag which is equal to
one for temporary areas and zero otherwise.

Fourteen data base procedure names are stored next. The
first six names are procedures to be called when the open
functions for retrieval, protected retrieval, exclusive
retrieval, update, protected update, and exclusive update
are executed normally. The seventh name is a procedure to
be executed when a close is executed normally. The final
seven names are procedures corresponding to the first seven,
but which are executed when errors occur. If a procedure is
not specified for a function, a null string will appear in
the file at the appropriate position.

Following the data base procedure names are the six
privacy lock entries. These locks have the same format as
the schema privacy lock. The six locks apply to the open
function for retrieval, protected retrieval, exclusive
retrieval, update, protected update and exclusive update
respectively.


C. Record Entries.

The record entries include information generated by the
member subentries of the schema's set entries as well as
information from the schema's record entries. The record
entries are preceded by a one byte number indicating the
number of record types present. Each record entry contains
the following data.

The record name is a null terminated string. It is followed
by a two byte signed integer indicating record size
for records of this entry type. A one byte location mode is
next. Additional location mode information may follow
depending on the mode: for modes zero and seven, no additional
information; for mode one, a one byte record index
and a one byte item index; for modes two and three, two or
more one byte item index numbers preceded by a one byte
number indicating the number of indices present; for modes
four and five, a null terminated string naming a data base
procedure and two or more one byte item indices preceded by
a one byte number indicating the number of indices present;
and for mode six, a one byte set index. The location mode
information is derived from the LOCATION clause of the
record entry and the encoding matches that in "rlocmod" of a
schema DBM record vector.
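
Because the location mode information varies in length, a program scanning a record entry must work out how many bytes to skip before the area data begins. The C sketch below computes that length from the mode byte and the rules given above. The names counted_list_length and location_info_length are hypothetical, and the sketch does not enforce the "two or more" minimum on the index counts.

```c
#include <stddef.h>
#include <string.h>

/* Length of a counted list: one count byte followed by that many one
 * byte indices. Returns 0 if the list does not fit in len bytes. */
size_t counted_list_length(const unsigned char *buf, size_t len)
{
    size_t need;

    if (len < 1)
        return 0;
    need = 1 + (size_t)buf[0];
    return need <= len ? need : 0;
}

/* Total length of the location mode byte plus its additional
 * information, or 0 if the data is malformed or truncated. */
size_t location_info_length(const unsigned char *buf, size_t len)
{
    const unsigned char *nul;
    size_t name_len, list_len;

    if (len < 1)
        return 0;
    switch (buf[0]) {
    case 0: case 7:                     /* no additional information */
        return 1;
    case 1:                             /* record index and item index */
        return len >= 3 ? 3 : 0;
    case 6:                             /* set index */
        return len >= 2 ? 2 : 0;
    case 2: case 3:                     /* counted list of item indices */
        list_len = counted_list_length(buf + 1, len - 1);
        return list_len ? 1 + list_len : 0;
    case 4: case 5:                     /* procedure name, then counted list */
        nul = memchr(buf + 1, '\0', len - 1);
        if (nul == NULL)
            return 0;
        name_len = (size_t)(nul - (buf + 1)) + 1;   /* includes the null */
        list_len = counted_list_length(nul + 1, len - 1 - name_len);
        return list_len ? 1 + name_len + list_len : 0;
    default:
        return 0;                       /* unknown mode */
    }
}
```

The same counted list helper would serve for the area indices of the WITHIN data, which use an identical count-then-indices layout.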


Following the location information is the area data
derived from the record type's WITHIN clause. This consists
of a one byte option code and area specifications. The area
specification format depends on the option code: for code
zero, a one byte area index; for code one, two or more one
byte area indices preceded by a one byte number indicating
the number of indices present; and for two, no further data.
The encoding of the WITHIN information matches that of
"rparea" in the schema DBM record vector.

Fourteen data base procedure names are stored next in
the entries. The first seven are procedures to be called
when the functions of "insert", "remove", "store", "delete",
"modify", "find" and "get" are executed normally on records
of this entry's type. The second seven data base procedures
are called when any of the above seven functions is executed
and an error occurs. If no procedure is defined in the
schema for a function, a null string replaces the function.

Following the data base procedure names are seven
privacy lock entries. These locks have the same format as
the schema privacy lock. The locks apply to the seven
functions listed in the previous paragraph.

1. Member Data.

Each record entry has zero or more set membership
entries following the record privacy locks. These entries
are preceded by a one byte number indicating the number of
membership entries present. The member entries for each
record type appear in the same order as the set entries in
the schema for which membership is defined. The contents of
each membership entry is as follows.

The set name for the membership is stored as a null
terminated string. Following the set name is a two byte
series of flag bytes which correspond to the bits of
"mflags" in a schema DBM member vector. The information
contained in these bits is derived from the MEMBER clause,
the KEY clause and the total number of SEARCH clauses
defined in the schema.

Next is a one byte number indicating the number of
items included in the primary key for this item. This number
is at most 16 and is zero if no key is defined. Following
this value is the appropriate number of primary key element
pairs. Each key pair consists of a one byte collating code
and an item index. The collating code is zero if this
element of the primary key is ascending and one if it is
descending.


Following the primary key specification are up to
seven search key strings (the exact number is recorded in
the flag bytes above). Each search key string is a null
terminated string of the item indices for the items in the
search key.

Following the search key strings is the set selection
data. If Format 2 of the SELECTION clause was used,
this data consists of the name of a data base procedure. If
Format 1 was used, the data is as follows. First is a one
byte code indicating the root selection mode. The code
corresponds to that in "mselflag" of a schema DBM member
vector. The remaining root selection data depends on the
root selection mode. For mode one and two, a one byte set
index follows the mode. For mode three, there is no further
root selection data. For mode four, the data is a null
terminated string of two byte pairs each of which contains a
record index and a set index.

The remaining set selection data for Format 1 consists
of the number of "THEN THRU" clauses followed by the
appropriate number of two byte selection pairs. Each pair
contains a set index and an item index. These pairs are the
source data for "mssel" in a schema DBM member vector.

Following the set selection information are six data
base procedure names. The first three procedures are called
when the functions of "insert", "remove" and "find" are
executed normally. The second three procedures are called when
these same functions are executed and an error results. If
no data base procedure is defined for a particular function,
a null string will appear in its position.

Following the data base procedure names are three
privacy lock entries in the same format as the schema lock.
These privacy locks are for the functions described in the
previous paragraph.

2. Item Data.

Each record entry has one or more item description
entries following the set membership entries. The item
entries are preceded by a one byte number indicating the
number of items present. The item entries are stored in the
same order that the items they represent appear in the
record type being described. The contents of each item
entry is as follows.

The first data in an item entry is the name of the
item stored as a null terminated string. If the item is one
that is not generated by an item sub-entry in the schema,
the item name will be a null string. Following the name is
a one byte level number. A level number between one and 100
is generated by a schema item sub-entry; 101 is a forward
link; 102 is a backward link; 103 is a link to owner; 104 is
an owner's link to first member; 105 is an owner's link to
last member; 106 is an owner's link to index and 107 is a
CALC synonym link.

The remaining data depends on the level number. If
the level number is between one and 100, inclusive, the next
byte contains the data type; between 101 and 106, inclusive,
the next byte is the set index; and for 107 no other data is
needed. If a picture is defined for the item, it is stored
next as a null terminated string.
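
The level number therefore serves both to classify an item and to tell the reader what the following byte means. The C sketch below expresses the two rules as functions. The enumeration names and the functions classify_level and byte_after_level are hypothetical.

```c
/* Hypothetical classification of an item entry by its level number. */
enum item_kind {
    ITEM_INVALID,
    ITEM_SCHEMA,              /* 1 to 100: generated by a schema item sub-entry */
    ITEM_FORWARD_LINK,        /* 101 */
    ITEM_BACKWARD_LINK,       /* 102 */
    ITEM_OWNER_LINK,          /* 103 */
    ITEM_FIRST_MEMBER_LINK,   /* 104 */
    ITEM_LAST_MEMBER_LINK,    /* 105 */
    ITEM_INDEX_LINK,          /* 106 */
    ITEM_CALC_SYNONYM_LINK    /* 107 */
};

/* Meaning of the byte that follows the level number. */
enum next_byte { NEXT_NONE, NEXT_DATA_TYPE, NEXT_SET_INDEX };

enum item_kind classify_level(unsigned level)
{
    if (level >= 1 && level <= 100)
        return ITEM_SCHEMA;
    switch (level) {
    case 101: return ITEM_FORWARD_LINK;
    case 102: return ITEM_BACKWARD_LINK;
    case 103: return ITEM_OWNER_LINK;
    case 104: return ITEM_FIRST_MEMBER_LINK;
    case 105: return ITEM_LAST_MEMBER_LINK;
    case 106: return ITEM_INDEX_LINK;
    case 107: return ITEM_CALC_SYNONYM_LINK;
    default:  return ITEM_INVALID;
    }
}

enum next_byte byte_after_level(unsigned level)
{
    if (level >= 1 && level <= 100)
        return NEXT_DATA_TYPE;      /* schema items carry a data type */
    if (level >= 101 && level <= 106)
        return NEXT_SET_INDEX;      /* link items carry a set index */
    return NEXT_NONE;               /* 107 and invalid levels carry nothing */
}
```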


Following the level and type data is a validity
checking description. The validity checking description is
a null terminated string which is encoded to fit the
requirements of "icheck" in a schema DBM item vector. It is
generated by the CHECK clause of a schema item sub-entry.
If no validity check is defined for the item, the string is
null.

Following the validity check description are three
two byte numbers representing the size (in bytes) of one
occurrence of the item; the number of occurrences of the item
in a record; and the starting byte number of the item within
the record.

The names of six data base procedures are next. The
first three are names of procedures to be called when the
functions of "store", "get", and "modify" are executed
normally. The second three procedures are called when these
functions are executed and an error occurs. If a procedure
is not defined for a function, a null string will appear in
its place.

Following the data base procedure names are three
privacy lock entries in the same format as the schema lock.
These locks apply to the functions mentioned in the previous
paragraph.


Set Entries.


Following the record entries are the set entries. These
entries are generated by set sub-entries in the schema. The
entries are preceded by a one byte number indicating the
number of sets defined. The contents of each entry are as
follows.


The first element in each entry is the set name stored
as a null terminated string. The set name is followed by a
one byte code which corresponds to the low order byte of
"sflags" in a schema DBM vector and describes the OWNER,
SET IS and ORDER clauses of the set sub-entry.


Next is a pair of bytes indicating the owner record.
The first byte is the owner record's index and the second is
the item index of the first item in the owner record having
to do with the set. Following the owner record data is a
one byte number indicating the number of member records and
three byte member descriptions present. Each member
description consists of the record index of the member; the
index of the set membership vector in the schema DBM record
vector for the record; and the item index of the first item
in the record having to do with this set. The order of the
member descriptions is alphabetical by member record name.


Following the member descriptions are four data base
procedure names stored as null terminated strings. The
first two are names of procedures to be called when the
"insert" or "remove" functions are executed normally. The
last two represent the same functions, but are called when
an error occurs. If a procedure is not defined in the
schema for a function, a null string appears in that place.


Following the data base procedure names are three
privacy locks. These locks are of the usual format. They
lock the functions of "insert", "remove" and "find",
respectively.


APPENDIX E. INTERPROCESS MESSAGE FORMATS.


A. Messages Received by the Schema DBM.


Messages received are read into "smesin", a character
buffer of length 512. The first byte of the message is a
function code. The remainder of the message will vary
depending on the function code. The message is terminated
by a mark, which is ten bytes of the octal code 0252. In
the message descriptions that follow, the function code is
included as part of the description heading.
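
The mark-based framing described above can be illustrated
with a short C sketch. It is not taken from the DBM source:
the helper name find_mark and the constants SMESIN_LEN,
MARK_BYTE and MARK_LEN are hypothetical, and the code is
written in present-day C rather than the UNIX C dialect used
in this thesis.

```c
#include <stddef.h>

#define SMESIN_LEN 512   /* size of "smesin" given in the text      */
#define MARK_BYTE  0252  /* octal value of each mark byte           */
#define MARK_LEN   10    /* the mark is ten consecutive such bytes  */

/*
 * Return the offset at which the first complete mark begins in
 * buf[0..len), which is also the length of the message body,
 * or -1 if no complete mark is present. (Hypothetical helper.)
 */
long find_mark(const unsigned char *buf, size_t len)
{
    size_t run = 0;   /* length of the current run of mark bytes */
    size_t i;

    for (i = 0; i < len; i++) {
        run = (buf[i] == MARK_BYTE) ? run + 1 : 0;
        if (run == MARK_LEN)
            return (long)(i + 1 - MARK_LEN);
    }
    return -1;
}
```

Because the mark is only a run of identical bytes, a message
body that itself ends in the byte 0252 cannot be told apart
from the start of the mark; the sketch reports the earliest
position at which ten such bytes begin.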
1. Initial Call Message (Code 0). 


The initial call is made by the user to request
validation of his sub-schema and to establish his access
permissions. Immediately following the function code is a
null terminated string containing the schema name. After
the schema name is a null terminated string containing the
privacy key for the schema. Following the schema entries
are the area entries.
a. Area Entries.


The area entries are preceded by a one byte
number indicating the number of areas in the sub-schema.
Each entry consists of seven null terminated strings. The
first string is the area name; the other six strings are
privacy keys and may be null strings. The privacy keys
specified are for retrieval, protected retrieval, exclusive
retrieval, update, protected update and exclusive update
respectively.
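
As an illustration of the area entry layout just described,
the following C sketch appends one entry (a name and six
privacy keys, each null terminated) to a message buffer.
The function names put_str and put_area_entry and the
constant AREA_KEYS are hypothetical and do not come from the
DBM source; a missing key is represented here by a NULL
pointer and is sent as a null string.

```c
#include <stddef.h>
#include <string.h>

#define AREA_KEYS 6  /* retrieval ... exclusive update, in text order */

/* Append one null terminated string at out[*pos]; returns 0 on
 * success or -1 if it does not fit in cap bytes. */
static int put_str(unsigned char *out, size_t cap, size_t *pos,
                   const char *s)
{
    size_t n = strlen(s) + 1;        /* include the terminator */

    if (n > cap - *pos)
        return -1;
    memcpy(out + *pos, s, n);
    *pos += n;
    return 0;
}

/* Encode one area entry: the area name followed by six privacy
 * keys. A key given as NULL is written as a null string. */
int put_area_entry(unsigned char *out, size_t cap, size_t *pos,
                   const char *name, const char *keys[AREA_KEYS])
{
    int i;

    if (put_str(out, cap, pos, name) != 0)
        return -1;
    for (i = 0; i < AREA_KEYS; i++)
        if (put_str(out, cap, pos, keys[i] ? keys[i] : "") != 0)
            return -1;
    return 0;
}
```

The same string helper would serve for record, member, item
and set entries, which differ only in the number of strings
and in the encoded lists that follow them.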
b. Record Entries.


The record entries are preceded by a one byte
number indicating the number of records in the sub-schema.
Each entry consists of seven null terminated strings, an
encoded member list and an encoded item list. The first
string is the record name and the remaining six are privacy
keys and may be null strings. The privacy keys are for
insert, remove, store, delete, modify and find respectively.


An encoded member list is headed by a one byte
number indicating how many member entries follow. Each
member entry consists of four null terminated strings. The
first string is the name of the set and the remaining three
are privacy keys and may be null. The privacy keys are for
insert, remove and find respectively.


An encoded item list is headed by a one byte
number indicating how many item entries are in the list.
Each entry has a one byte entry code followed by four null
terminated strings. The first string is the item name and
the rest are privacy keys and may be null. The privacy keys
are for store, get and modify. The remainder of the item
entry varies depending on the entry code specified below.


(1) Atomic Item (Code 0). No further fields
exist in the item entry for an atomic item.


(2) Vector (Code 1). A one byte number
indicating the number of occurrences of the item follows the
privacy keys.


(3) Repeating Group (Code 2). A pair of one
byte numbers follows the privacy keys. The first number
indicates the number of subsequent item entries in the
repeating group and the second indicates the number of
occurrences in the group.
c. Set Entries.


The set entries follow the record entries. They
are preceded by a one byte number indicating the number of
set entries. Each set entry consists of four null
terminated strings. The first string is the set name and the
remaining three are privacy keys and may be null. The
privacy keys are specified for insert, remove and find.


2. Open Message (Code 9).


The function code is followed by a one byte mode,
which uses the same encoding as the C dbopen function (see
Appendix A, Section B.2.6). The remainder of the message
consists of one byte area index numbers. No area numbers
should be included if every area known to the sub-schema is
to be opened.


3. Close Message (Code 1).


The function code is followed by a list of one byte
area index numbers. No area numbers are included in the
message if all open areas are to be closed.
4. Find Message (Code 3).


The function code is followed by a one byte
selection type and selection codes. The possible selection
types and their corresponding record selection codes are:


Code zero indicates direct access and the
selection code will be a four byte data base key.
b. Owner Record. 


Code one indicates selection of the owner record
for the set of the specified type that the specified record
belongs to. The first selection code is a one byte set
index and the second is a four byte data base key.


c. Relative Area.


Type code two specifies relative selection in
the designated area. The first selection code is a one byte
criterion with zero, one, two and three meaning next,
previous, first and last, respectively; and four, five, six
and seven meaning next, previous, first and last of a
specified record type. If the criterion is four through
seven, the second selection code is a one byte record index.
The last selection code is a four byte data base key.
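
The criterion encoding above can be sketched in C as
follows. The names are hypothetical; the only facts used are
the criterion values listed in the text, from which it
follows that the low two bits give the direction and that
values four through seven carry an extra record index byte.

```c
/* Directions in the order given in the text. */
enum reldir { REL_NEXT = 0, REL_PREV = 1, REL_FIRST = 2, REL_LAST = 3 };

/* Direction encoded in a criterion byte (0..7): its low two bits. */
int rel_direction(unsigned crit)
{
    return (int)(crit & 03u);
}

/* Nonzero if the criterion restricts the search to a record type,
 * in which case a one byte record index follows it. */
int rel_has_rectype(unsigned crit)
{
    return crit >= 4 && crit <= 7;
}

/* Total length in bytes of the selection codes for type code two:
 * criterion, optional record index, four byte data base key.
 * Returns -1 for a criterion outside 0..7. */
int rel_area_codes_len(unsigned crit)
{
    if (crit > 7)
        return -1;
    return 1 + (rel_has_rectype(crit) ? 1 : 0) + 4;
}
```

For a relative set selection (type code three) one further
byte, the set index, would precede these codes.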
d. Relative Set.


Type code three specifies relative selection in
the designated set. The first selection code is a one byte
set index. The remaining selection codes are identical to
those for type code two.
e. CALC Key. 


Type code four indicates hash key selection.
The first selection code is a one byte record index. The
remaining codes are item triples for all the items of the
specified record type which are known to the sub-schema and
are not associated with an "OCCURS" clause. An item triple
consists of an item specification, a one byte data type
code, and a data value of the specified type. An item
specification consists of a one byte item index followed
by zero or more one byte subscript values as appropriate.
The data type codes are one for integer, two for single
precision floating point, three for double precision
floating point, four for null terminated string and five for
data base key.


f. Duplicate CALC Key.


Type code five indicates selection of the next
record with a hash key duplicating the hash key of the
specified record. The selection code is a four byte data
base key.


g. Current Set Data Value.


Type code six indicates that the first record of
the specified type which matches the specified item values
in the specified set occurrence is to be selected. The first
selection code is the one byte record type index of the
record type to be selected. The second selection code is a
one byte set type index. The third selection code is a four
byte data base key of a record which belongs to the set
occurrence to be scanned. The remaining selection codes are
zero or more item triples.


h. Selected Set Data Value.


Type code seven indicates that the first record
of the specified type which matches the specified item
values in the set occurrence selected through the specified
record type's member sub-entry "SELECTION" clause is
selected. The first two selection codes are identical to
those in the preceding paragraph.


The third selection code is a one byte number
which indicates the number of path selection codes which
follow. The path selection codes are item quadruples. An
item quadruple consists of a one byte record index, an item
specification, a one byte data type code, and a data value
of the type specified in the data type code. The remaining
selection codes are zero or more item triples for items in
the specified record type.


i. Current Set Duplicate Value.


Type code eight indicates that the record to be
selected is the next record (if any) which is of the same
type as the specified record, in the set of the specified
type, and matches the specified record in the specified
items. The first selection code is a four byte data base
key. The second selection code is a set type index. The
remaining selection codes are item specifications.


5. Get Message (Code 5).


Following the function code is a four byte data base
key. The remainder of the message consists of item doubles
for the items the user program desires. An item double
consists of an item specification and a one byte data type.
An omitted subscript in a data specification means every
occurrence of the vector or repeating group is desired. A
double for a repeating group has a data type of zero. The
doubles for the elements in the repeating group must
immediately follow the repeating group's double.


6. Store Message (Code ie).


The remainder of the message is a one byte record
type index.


7. Response to Request for Data Message (Code 100).


This message provides data requested in a Request
for Data Message sent by the schema DBM. The function code
is followed by an item specification for each requested item
consisting of a one byte record type index, a one byte item
index and a one byte data type. The requested data follows
the item type specifications in exactly the same order as in
the Request for Data. For group items, the order of
appearance of the subordinate items of the group in the item
specifications is the order the items must appear within
each occurrence of the group item in the data portion of the
message. The other data may be area or set type indices or
data base keys.


8. Insert Message (Code 7).


The function code is followed by a four byte data
base key of the record to be inserted. The remainder of the
message consists of set specifier pairs. A set specifier
pair consists of a one byte set type index followed by a
four byte data base key indicating the current record in the
current occurrence of the set of the type specified.
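
A minimal C sketch of assembling an Insert Message from this
description follows. The names FC_INSERT, DBKEY_LEN, struct
setspec and build_insert are hypothetical. Data base keys
are handled as four opaque bytes, since the text does not
state a byte order, and the terminating mark is left to the
caller.

```c
#include <stddef.h>
#include <string.h>

#define FC_INSERT 7   /* Insert Message function code from the text */
#define DBKEY_LEN 4   /* data base keys occupy four bytes           */

/* One set specifier pair. */
struct setspec {
    unsigned char set_index;       /* one byte set type index      */
    unsigned char key[DBKEY_LEN];  /* current record of that set   */
};

/*
 * Build the body of an Insert Message in out. Returns the number
 * of bytes written, or 0 if cap is too small for the message.
 */
size_t build_insert(unsigned char *out, size_t cap,
                    const unsigned char rec_key[DBKEY_LEN],
                    const struct setspec *pairs, size_t npairs)
{
    size_t need = 1 + DBKEY_LEN + npairs * (1 + DBKEY_LEN);
    size_t pos = 0;
    size_t i;

    if (need > cap)
        return 0;
    out[pos++] = FC_INSERT;
    memcpy(out + pos, rec_key, DBKEY_LEN);
    pos += DBKEY_LEN;
    for (i = 0; i < npairs; i++) {
        out[pos++] = pairs[i].set_index;
        memcpy(out + pos, pairs[i].key, DBKEY_LEN);
        pos += DBKEY_LEN;
    }
    return pos;
}
```

A record inserted into one set therefore produces a ten byte
body: the function code, the record's key, and one five byte
set specifier pair.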
9. Remove Message (Code 11). 


Following the function code is a four byte data base
key. The remainder of the message is zero or more one byte
set type indices.


10. Modify Message (Code 8).


Following the function code is a one byte number
indicating how many set memberships are being modified,
followed by the required number of set specifier pairs. The
remainder of the message consists of zero or more item
triples.


11. Delete Message (Code 2).


Following the function code is a four byte data
base key. The remainder of the message is a one byte
deletion code with the same values as for the parameter of
the C "delete" function (see Appendix A, Section B.2.i).


B. Messages Transmitted by the Schema DBM. 


Messages transmitted are in response to messages
received and fall into two categories: normal responses and
error messages. The first byte of the message is a response
code and is zero for normal responses and equal to the error
code for error messages. The format of the responses, after
the first byte, varies depending on the previously received
message (for normal responses) or on the error type (for
error messages). These formats are detailed below.


1. Normal Responses.
a. Initial Call Message.


The normal response to an initial call is an
encoded schema description for the user. This description
is a series of one byte numbers each of which represents the
index number associated with the data base names (except the
schema name) in the Initial Call Message. For areas, this
is the index number of the area; for records, the index
number of the record followed by the index number of each
data item or data aggregate; for sets, the set number.


b. Find Message.


The normal response to a Find Message contains
the information necessary to establish currency for the
selected record. This information consists of a four byte
data base key, a one byte record type index indicating the
type of the record, and zero or more one byte set type
indices indicating the set types for the set occurrences in
which the record participates.
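
The layout of a normal Find response can be sketched in C as
below. The structure and function names are hypothetical.
The sketch assumes that the terminating mark has already
been removed, so that len covers only the response code and
the fields listed above, and it keeps the data base key as
four opaque bytes.

```c
#include <stddef.h>
#include <string.h>

#define DBKEY_LEN 4

/* Fields of a normal Find response, as listed in the text. */
struct findresp {
    unsigned char key[DBKEY_LEN];  /* data base key of the record  */
    unsigned char rectype;         /* record type index            */
    const unsigned char *sets;     /* set type indices, if any     */
    size_t nsets;                  /* number of set type indices   */
};

/*
 * Split a response of len bytes into its fields. Returns 0 on
 * success, or -1 if the response code is nonzero (an error
 * message) or the response is too short to hold the fixed part.
 */
int parse_find_response(const unsigned char *msg, size_t len,
                        struct findresp *r)
{
    if (len < 1 + DBKEY_LEN + 1 || msg[0] != 0)
        return -1;
    memcpy(r->key, msg + 1, DBKEY_LEN);
    r->rectype = msg[1 + DBKEY_LEN];
    r->sets    = msg + 2 + DBKEY_LEN;
    r->nsets   = len - (2 + DBKEY_LEN);
    return 0;
}
```

The caller would then update its currency indicators for the
record type and for each set type index returned.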
c. Store Message.


The normal response to a Store Message is a
Request for Data Message. This message requests the data
needed to perform the store function. The message is
composed of request entries each prefixed by a one byte
request type code. The request entry formats are listed
below along with their request type codes.


(1) Data Item Request (Code 0). This request
entry consists of an item specification. A repeating group
index is an implied request for all subordinate elements in
the repeating group known to the sub-schema. If a subscript
is missing, the data in all occurrences of the relevant
element is requested.


(2) Area Index Request (Code 1). This request
consists of the request code alone. It requests the
contents of "areaid" in the user program.


(3) Data Base Key (Code 2). This request
consists of the request code alone. It requests the data
base key associated with "keyname" in the user program.


(4) Current of Set. This request consists of a
set type index. It requests the data base key of the
current record of the specified set type.


d. Request for Data Message.


The normal response to a Request for Data is
identical to that for a find. The record information passed
is for the record just stored.
e. Messages with a Null Response.


The normal response to certain messages is a
response code only. These messages are Open, Close, Insert,
Modify, Remove and Delete.


2. Error Messages.
a. Invalid Sub-schema (Code 60).


This response is the result of a mismatch
between the schema and the sub-schema presented in the user
program's Initial Call Message. After the error code is a
one byte first error code: zero for schema entry; one for
area entry; two for record entry; and three for set entry.
Following this code are three one byte entries giving the
number of the first erroneous entry in area, record and set
entries respectively.


b. Area Already Open (Code 28).


This response to an Open Message has a one byte 


error count following the response code. 
c. Truncation of Data (Code 54). 


This response to a Get Message has identical
format to the normal response to a Get Message except for
the response code.


d. Messages with Error Code Only. The remaining
error responses consist of an error code only as follows.


(1) Data Base Key Invalid (Code 2).
(2) Data Items Invalid or Inconsistent (Code 4).
(3) Violation of DUPLICATES NOT ALLOWED clause (Code 5).
(4) End of Set or Area (Code 7).
(5) Invalid Record or Set Index (Code 8).
(6) Attempted Update on Retrieval Only Area (Code 9).
(7) Privacy Breach Attempted (Code 10).
(8) Media Space not Available (Code 11).
(9) Data Base Key not Available (Code 12).
(10) Insert into Mandatory Automatic Set (Code 14).
(11) Remove out of Mandatory Set (Code 15).
(12) Insert into Set with Existing Membership (Code 16).
(13) Implicitly Referenced Area not Available (Code 18).
(14) Affected Area not Open (Code 21).
(15) Illegal Area Index (Code 23).
(16) Set Occurrence would Span Temporary and
Permanent Areas (Code 24).
(17) No Set Occurrence Satisfies Specified
Arguments (Code 25).
(18) No Record Satisfies Record Selection
Expression (Code 26).
(19) CHECK Clause Violated (Code 27).
(20) Usage Mode Conflict with Other Processes
(Code 29).
(21) Unqualified DELETE on Owner of a Non-empty
Set (Code 30).
(22) No Initial Call Message (Code 61).
(23) Indecipherable or Unprocessable Message
(Code 100).


APPENDIX F. DBM SKELETON PROGRAM.


The skeleton program is identical for every schema DBM.
Compiled schema DBMs differ only in the values associated
with certain DEFINE'd constants controlling array sizes, in
the initialization of certain arrays, and in the data base
procedures which are included in the compiled version. When
the schema DBM is executed, it initializes its tables from
the schema description in the Schema Description File.
These tables drive the processing of the data base. The
schema DBM concurrently reads the user's sub-schema
description from the interprocess communication pipe. The
sub-schema description is validated and index numbers are
produced to allow translation of user requests and data into
system requests and data.


This Appendix describes the data organization of the
skeleton. Documentation for each service routine and
utility routine is contained in the source program listings.
Listings and machine readable source of the DBM skeleton can
be obtained by contacting the Department of Computer Science
(Code SeRs). For an explanation of the values associated
with DEFINE'd constants mentioned in this Appendix see
Appendix H. For a description of a user's view of the
service routines, see Appendix A.


A. General Tables.


Certain tables and buffers are available for use by 


other tables. 


1. The character buffer, "scharbuf", is dimensioned by a
DEFINE'd constant. It is used to store character strings
and other relatively short variable length data.


2. "Procname" is an array of strings containing the
names of the data base procedures. "Procpoint" is an array
of function pointers pointing to the functions defined in
"procname". These arrays are used to set up the data base
procedure pointers used in other tables. Both "procname"
and "procpoint" are dimensioned by a DEFINE'd constant.


3. The privacy vector array ("pvec"), dimensioned by a
DEFINE'd constant, is used for data item and data aggregate
privacy information. Its elements are structures of type
"privect". The format of a "privect" structure is


     struct privect {
          char ptype;     // type of privacy lock
          char *plock;    // pointer to privacy lock
     };


The "ptype" code is "s" if the "plock" pointer points to a
string, and "p" if a data base procedure is indicated.


4. The record buffer array, "srecbuf", contains all the
record buffers for areas, records and sets.


B. Organization for Area Management.


The schema DBM contains an array of structures of type
"areavect", dimensioned by a DEFINE'd constant, called
"avec". Each structure in the array is used to describe an
area in the data base. The format of an "areavect"
structure is as follows:


     struct areavect {
          int aflags;            // see below
          int ause;              // usage count of last reference
          char *adatapath;       // pointer to path to data file
          int adatades;          // file descriptor for data file
          char *akeypath;        // pointer to path to key file
          int akeydes;           // file descriptor for key file
          char acrecloc[3];      // location of current record
          char *acurrec;         // pointer to current rec buffer
          int acurkey[2];        // first key # in current key buff
          char akeybuf[768];     // buffer for db key mappings
          int (*provec[14])();   // pointers to db procedures
          int apflags;           // permission flags for functions
          int awaste;            // current waste count
     };


"Aflags" is formed by a bit-wise OR of the following
octal codes:


     TEMP      0100000   The area is temporary
     PHSOP     040000    The files are physically open
     KEYBMOD   020000    Current key block modified
     CRECMOD   010000    Current record modified
     KEYBVAL   04000     Key block is valid
     CRECVAL   02000     Record buffer is valid
     CRECSIZ   01000     Current record has increased size
     KNOWN     0400      Area is known to the sub-schema
     RETRV     01        Area open for retrieval
     PRETRV    02        Area open for protected retrieval
     ERETRV    03        Area open for exclusive retrieval
     UPDT      04        Area open for update
     PUPDT     05        Area open for protected update
     EUPDT     06        Area open for exclusive update
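
The codes 01 through 06 in this table are not separate bits:
they are consecutive values, so at most one of them can be
combined with the status bits at a time. The C sketch below
treats the low three bits of "aflags" as a usage mode field
on that reading. The mask name MODEMASK and the two helper
functions are hypothetical; the other constants are copied
from the table.

```c
/* Octal codes copied from the "aflags" table. */
#define TEMP    0100000
#define PHSOP   040000
#define KNOWN   0400
#define RETRV   01
#define PRETRV  02
#define ERETRV  03
#define UPDT    04
#define PUPDT   05
#define EUPDT   06

#define MODEMASK 07   /* hypothetical: low three bits hold the mode */

/* Usage mode (RETRV through EUPDT) held in aflags, or 0 if the
 * area is not open in any mode. */
int area_mode(unsigned aflags)
{
    return (int)(aflags & MODEMASK);
}

/* Nonzero if the area is open in one of the three update modes. */
int area_is_update(unsigned aflags)
{
    int m = area_mode(aflags);

    return m >= UPDT && m <= EUPDT;
}
```

A status bit is tested in the usual way, for example
(aflags & PHSOP) to see whether the files are physically
open.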


The "apflags" are set when the user sub-schema is
validated. These flags indicate the functions allowed the
user for the area as follows:


     0100000   Retrieval
     040000    Protected retrieval
     020000    Exclusive retrieval
     010000    Update
     04000     Protected update
     02000     Exclusive update


C. Logical Usage Block.


The logical usage block records the current usage mode
for each area in the data base currently being used by any
schema DBM. The logical usage block is organized into two
byte integer entries, one for each area in the data base.
Each two byte entry is divided into four fields: bit 15 is
the exclusive use bit; bits 14 through ten form a count of
retrievers; bits nine through five, a count of protected
retrievers; and bits four through zero form a count of
updaters. The logical usage block can record up to 31 users
in each category. If a schema DBM has an area open for a
protected mode, the updater count is set to 31.
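
The field layout of a logical usage block entry can be
expressed with shifts and masks, as in the C sketch below.
The names are hypothetical and the sketch uses the
fixed-width type uint16_t for the two byte entry; only the
bit positions are taken from the text.

```c
#include <stdint.h>

#define LUB_EXCL  0x8000u  /* bit 15: exclusive use bit           */
#define LUB_MASK  037u     /* each count is five bits, so <= 31   */

/* Count of retrievers: bits 14 through 10. */
unsigned lub_retrievers(uint16_t e)  { return (e >> 10) & LUB_MASK; }

/* Count of protected retrievers: bits 9 through 5. */
unsigned lub_pretrievers(uint16_t e) { return (e >> 5) & LUB_MASK; }

/* Count of updaters: bits 4 through 0. */
unsigned lub_updaters(uint16_t e)    { return e & LUB_MASK; }

/* Nonzero if the exclusive use bit is set. */
int lub_exclusive(uint16_t e)        { return (e & LUB_EXCL) != 0; }

/* Assemble an entry from its four fields; counts above 31 are
 * truncated to five bits. */
uint16_t lub_pack(int excl, unsigned retr, unsigned pretr, unsigned upd)
{
    return (uint16_t)((excl ? LUB_EXCL : 0u) |
                      ((retr  & LUB_MASK) << 10) |
                      ((pretr & LUB_MASK) << 5)  |
                      (upd & LUB_MASK));
}
```

Setting the updater count to its maximum of 31 while an area
is open in a protected mode leaves no room for a further
updater to be counted, which is presumably how other
updaters are kept out.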
D. Organization for Record Management.


The schema DBM contains an array of structures of type
"recvector". Each element of the array is the record vector
for a record type defined in the schema. The format of a
"recvector" structure is:


     struct recvector {
          int rflags;               // see below
          char rlocmod[3];          // see below
          char rarea[3];            // see below
          int (*rprovec[14])();     // pointers to db procedures
          char rnumsets;            // number of set types for record
          struct member *rsets;     // pointer to member vectors
          char rnumitem;            // number of items in record
          struct itemvect *ritems;  // pointer to item vectors
          int rpflags;              // permission flags for functions
          char **rcurrec;           // pointer to current record buffer
     };


"Rflags" is currently used only to indicate whether or not
the sub-schema knows about the record type. The octal code
KNOWN (0400) is used for this function.


"Rlocmod" is derived from the LOCATION clause of the
schema RECORD entry (see Section 3.3.4 of Ref. 2) and is
interpreted as follows. Character zero gives the location
mode: zero for DIRECT with key passed as a parameter; one
for DIRECT with key stored in a record; two for CALC using
the standard key transformation with no duplicates; three
for CALC using the standard key transformation with
duplicates allowed; four for CALC using a data base
procedure with no duplicates; five for CALC using a data
base procedure with duplicates allowed; six for VIA a set;
and seven for SYSTEM mode. The last two bytes in "rlocmod"
vary in meaning depending on the mode. For mode zero, bytes
one and two are unused. For mode one, byte one contains the
record type and byte two, the item index of the data item
which holds the data base key. For modes two through five,
bytes one and two hold a pointer to the randomizing key
description. A randomizing key description is a null
terminated series of bytes, the first containing the item
index of the key link item and subsequent bytes containing
the item indices of the fields of the randomizing key. For
modes four and five, the randomizing key description is
headed by a pointer to the appropriate data base procedure.
For mode six, byte one contains the set index of the set to
be consulted and byte two is unused.


"Rarea" is derived from the WITHIN clause of the schema
RECORD entry and is formatted as follows. Byte zero is the
WITHIN option code and has the following interpretation:
zero, all records are within a single area; one, multiple
areas are possible (selected by a user input value); two,
the area will be the area of the owner of the set of a
specified type in which the record participates. The values
of bytes one and two of "rarea" vary depending on the WITHIN
option: for zero and two, byte one contains an area index
number and byte two is unused; for one, bytes one and two
contain a pointer to a WITHIN criteria. A WITHIN criteria
is a null terminated string of bytes each containing one of
the allowed area numbers for this record.


1. Member Vectors.


Each record vector contains a pointer ("rsets") to
an array of set membership vectors if it participates in any
sets. A set membership vector is a structure of type
"member". The format of the "member" structure is:


     struct member {
          char msetnum;           // set index for this entry
          int mflags;             // see below
          int morder;             /* flag bits for key item
                                     0 = ascending, 1 = descending */
          char *mpkey;            // pointer to items for prime key
          char **mskey;           // pointer to SEARCH index pointers
          char mselflag;          // root selection flag
          int *mselid;            // pointer to root selection id
          int (*mssel)();         // pointer to set selection proc
          int (*mprovec[6])();    // pointers to db procedures
          char mpflags;           // permission flags for functions
     };


The "mflags" for a member entry is formed by a
bit-wise OR of the following codes:


     MMAND    0100000   Membership is mandatory
     MAUTO    040000    Membership is automatic
     MLINK    020000    Member is linked to owner
     MSSEL    010000    Set selection by db procedure
     MPKEY    04000     Primary key is defined
     MPRKEY   02000     RANGE option
     MPFKEY   01000     Duplicates first
     MPLKEY   0400      Duplicates last
     MPDKEY   0200      Duplicates arbitrary
     MPNKEY   0100      Nulls allowed
     MKNOWN   040       Membership is known to sub-schema


Whenever any of MPRKEY through MPNKEY are set, MPKEY must be
set. Additionally, the low order three bits of "mflags"
contain the number of secondary indices defined to support
SEARCH keys for this membership.
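
The two rules stated above for "mflags" can be checked with
a few lines of C. The sketch below is illustrative only: the
names MPOPTS, MNINDEX, mflags_nindex and mflags_consistent
are hypothetical, while the flag values are copied from the
table.

```c
/* Octal codes copied from the "mflags" table. */
#define MPKEY   04000
#define MPRKEY  02000
#define MPFKEY  01000
#define MPLKEY  0400
#define MPDKEY  0200
#define MPNKEY  0100

/* Hypothetical helpers: all primary key option bits, and the mask
 * for the secondary index count in the low three bits. */
#define MPOPTS  (MPRKEY | MPFKEY | MPLKEY | MPDKEY | MPNKEY)
#define MNINDEX 07

/* Number of secondary indices supporting SEARCH keys. */
int mflags_nindex(unsigned mflags)
{
    return (int)(mflags & MNINDEX);
}

/* Nonzero if the flags obey the rule that any primary key option
 * requires MPKEY itself to be set. */
int mflags_consistent(unsigned mflags)
{
    return (mflags & MPOPTS) == 0 || (mflags & MPKEY) != 0;
}
```

The count returned by mflags_nindex is also the dimension of
the pointer string addressed by "mskey".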


The pointer "mskey" points to a string of pointers
dimensioned by the count stored in "mflags". Each pointer
points to a null terminated string. The first byte of this
string indicates whether duplicates are allowed; the second
byte is the item index for the owner record item linking the
search index; and the remaining bytes are item indices, each
representing a field in the SEARCH key for the search index.
Duplicates are allowed if the first byte of the string is a
one and not allowed if it is a two.


"Mpflags" is set when the sub-schema is validated.
These flags indicate the functions allowed the user for the
record/set pair as follows:


     0200   Insert
     0100   Remove
     040    Find


When "mflags" has MSSEL set, "mssel" is a pointer to
a data base procedure for set selection and "mselflag" and
"mselid" are unused. If MSSEL is not set, "mselflag" is a
code describing the root set selection in the set selection
chain for this member. The possible values of "mselflag"
are as follows: one for singular sets; two for current of set
type; three for through data base key; and four for through
CALC key. The data in "mselid" depends on the value of
"mselflag". For singular sets and current of set selection,
"mselid" contains the set index of the root set. For
selection by data base key, "mselid" is not used. For
selection by CALC key, "mselid" is a pointer to a null
terminated string of character pairs which are the record and
item indices of the items to be used in forming the CALC key.


When set selection is not by a data base procedure
and the number of THRU clauses in the SELECTION clause for
this member entry is greater than one, then a selection
chain exists and "mssel" is a pointer to a null terminated
string of byte pairs. Each pair in this string describes
the set selection for one of the successive set types in the
set selection chain. Each pair consists of the index of the
next set in the chain and the index of the data item which
must be matched in the owner record.


2. Item Vectors.


Each record contains a pointer ("ritems") to an
array of item description vectors. Each element of the
array is a structure of type "itemvect" and describes one of
the fields appearing in the record. A field may be in a
record for CALC key linkage, for set linkage or as a result
of a data sub-entry in the record's source description. The
format of an "itemvect" structure is:


     struct itemvect {
          char *iname;              // pointer to name of item
          char ilevel;              // item level
          char itype;               // type of data represented
          char *ides;               // data description pointer
          char *icheck;             // pointer to validity check
          int isize;                // size of one occurrence of item
          int Vigmees;              // number of occurrences of item
          int isbyte;               // starting byte within record
          int (*iprovec[6])();      // pointers to db procedures
          struct privector *ipvec;  // pointer to privacy locks
          char ipflags;             // item privacy flags
     };

The "ilevel" entry specifies the level number of the
item. A level number between one and 100 indicates the item
was generated by an item sub-entry; level 101, a forward
chain link for a set; 102, a backward link; 103, a link to
owner; 104, an owner's link to first member; 105, an owner's
link to last member; 106, an owner's link to index; and 107,
a CALC synonym link.
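
The level number ranges can be made explicit with a small
classification routine. The following C sketch is only an
illustration; the enumeration and function names are
hypothetical, and the values are those listed above.

```c
/* Kinds of item distinguished by "ilevel". */
enum itemkind {
    IK_INVALID,        /* level number outside 1..107            */
    IK_SCHEMA_ITEM,    /* 1..100: generated by an item sub-entry */
    IK_FORWARD,        /* 101: forward chain link for a set      */
    IK_BACKWARD,       /* 102: backward link                     */
    IK_OWNER,          /* 103: link to owner                     */
    IK_FIRST_MEMBER,   /* 104: owner's link to first member      */
    IK_LAST_MEMBER,    /* 105: owner's link to last member       */
    IK_INDEX,          /* 106: owner's link to index             */
    IK_CALC_SYNONYM    /* 107: CALC synonym link                 */
};

/* Classify an item by its level number. */
enum itemkind item_kind(unsigned ilevel)
{
    if (ilevel >= 1 && ilevel <= 100)
        return IK_SCHEMA_ITEM;
    switch (ilevel) {
    case 101: return IK_FORWARD;
    case 102: return IK_BACKWARD;
    case 103: return IK_OWNER;
    case 104: return IK_FIRST_MEMBER;
    case 105: return IK_LAST_MEMBER;
    case 106: return IK_INDEX;
    case 107: return IK_CALC_SYNONYM;
    default:  return IK_INVALID;
    }
}

/* Levels 101 through 106 are set links and carry a set index. */
int is_set_link(unsigned ilevel)
{
    return ilevel >= 101 && ilevel <= 106;
}
```

The distinction drawn by is_set_link matters because only
the set link levels are accompanied by a set index; a CALC
synonym link (107) needs no further data.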


"Itvpe" is the data type code for the item: zero for 
repeating groups; one for a PICTURE'd character string; two
for a PICTURE'd numeric string; three for a binary integer;
five for a single precision floating point number; six for a
double precision floating point number; seven for a
character string; eight for a bit string; and nine for a
data base key.


If the item is a set link, "ides" is the set index
for the set. If the item has a picture specified, "ides" is
a pointer to a character string containing the picture (see
Ref. 2, Section 3.3.8 for a description of PICTURE'd data).
In other cases, "ides" is unused.
"Icheck" is a pointer to a validity checking
description for the item. The first character in the
description is a flag byte. If the high order bit of the
flag byte is on, the picture is used as a check. If bit six
is on, a data base procedure is used as a check. If bit
five is on, check values are used. If a data base procedure
is specified, a pointer to the procedure is stored
immediately after the flag character. If check values are
specified they are stored at the end of the validity checking
description. Check values consist of a series of check
entries separated by ASCII comma characters and terminated
by a null byte. Each check entry is either a literal of the
same format as the item or a pair of such literals separated
by an ASCII dash character.
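The check value format described above can be illustrated with a short scanner. This is a sketch under stated assumptions rather than the DBMS routine: it assumes the item is a non-negative decimal integer (so that a dash can only denote a range), it assumes bits are numbered zero through seven from the low-order end, and the names "CHK_PICTURE", "CHK_PROC", "CHK_VALUES", "has_check_values" and "value_passes" are hypothetical.

```c
#include <stdlib.h>

/* Flag-byte masks. Names are hypothetical; bits are assumed to be
 * numbered 0..7 from the low-order end, so the high-order bit is bit 7. */
#define CHK_PICTURE 0200  /* high-order bit: picture used as a check */
#define CHK_PROC    0100  /* bit six: data base procedure used as a check */
#define CHK_VALUES  040   /* bit five: check values are used */

/* Nonzero if the flag byte says check values are present. */
int has_check_values(unsigned char flag)
{
    return (flag & CHK_VALUES) != 0;
}

/*
 * Return 1 if value satisfies a null-terminated check value list such
 * as "1-5,9,12-20", else 0. Assumes non-negative decimal literals.
 */
int value_passes(const char *list, long value)
{
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi = lo;

        if (end == p)
            return 0;               /* malformed entry */
        p = end;
        if (*p == '-') {            /* a pair of literals: lo-hi */
            p++;
            hi = strtol(p, &end, 10);
            if (end == p)
                return 0;           /* dash with no second literal */
            p = end;
        }
        if (value >= lo && value <= hi)
            return 1;
        if (*p == ',')
            p++;                    /* step over the separator */
        else if (*p != '\0')
            return 0;               /* unexpected character */
    }
    return 0;
}
```

With the list "1-5,9,12-20", the values 3, 9 and 20 pass and the value 7 fails. Literals of other item formats (character strings, for example) would need a comparison appropriate to that format.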


The "ipflags" are set when the sub-schema is
validated. The octal codes and function permissions are:

    0200    Store is permitted
    0100    Get is permitted
     040    Modify is permitted


E. Organization for Set Management.


The schema DBM program contains an array of set vectors.
Each set vector describes one of the set types defined in
the schema and is a structure of type "setvect". The format
of a "setvect" structure is:
struct setvect {
    int sflags;              // see below
    char sowner;             // owner record type
    char sfitem;             // index of 1st item for set
    char *smemb;             // pointer to member description
    int (*sprovec[4])();     // pointers to db functions
    char spflags;            // function permission flags
    char scurown[3];         // db key of current owner rec
    char *scurrec;           // pointer to current record buf
};
The value of "sflags" is formed by a bit-wise OR of the
following octal codes:

    KNOWN     0400    Set type is known to sub-schema
    SYSTEM    0200    Singular set
    DYNAMIC   0100    Dynamic set type
    PRIOR      040    Members contain backward links
    INDEXED    020    Primary set order is via an index


The lower four bits of "sflags" indicate the order criteria
for the set: zero, the order is immaterial; one, new records
are inserted on the front of the set; two, new records are
inserted at the end of the set; three, new records are
inserted after the current record of the set; four, new
records are inserted prior to the current record; five
through 11, a sorting order. Five indicates sorted by data
base key; six, sorted by record names and then by member
keys; seven, sorted by the member record keys with
relationship between records of different types immaterial;
and eight through 11 indicate sorted by member keys (this
implies that the format of each member record's keys is the
same). The last four codes specify duplicate processing:
eight, duplicates are allowed; nine, duplicates are first;
ten, duplicates are last; and 11, duplicates are not
allowed. Items in the owner record dealing with the set are
assumed to be stored contiguously.

"Smemb" points to a null terminated string of bytes
indicating the member record types for the set. The string
is made up of three byte entries. The first byte is the
member record index; the second is the membership vector
index of the member record for this set; and the third is
the item vector index of the first item in the record
dealing with this set. All items having to do with the set
are assumed to be stored contiguously in the member records.
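The encoding of "sflags" and the layout of the "smemb" string can be sketched as follows. This is an illustration in present-day C, not the DBMS source. The five flag names and values come from the table above; "ORDERMASK", "struct memb_entry" and the function names are hypothetical. The sketch also assumes that the terminating null byte of "smemb" sits where the first byte of the next entry would be, which in turn assumes that member record indices are never zero.

```c
/* Flag codes from the table above. */
#define KNOWN     0400   /* set type is known to sub-schema */
#define SYSTEM    0200   /* singular set */
#define DYNAMIC   0100   /* dynamic set type */
#define PRIOR     040    /* members contain backward links */
#define INDEXED   020    /* primary set order is via an index */
#define ORDERMASK 017    /* hypothetical name: the lower four bits */

/* Extract the order code (0 through 11) from "sflags". */
int set_order(int sflags)
{
    return sflags & ORDERMASK;
}

/* Nonzero if the set is kept in a sorted order (codes 5 through 11). */
int is_sorted(int sflags)
{
    int order = set_order(sflags);
    return order >= 5 && order <= 11;
}

/* Nonzero if the order code forbids duplicates (code 11). */
int rejects_duplicates(int sflags)
{
    return set_order(sflags) == 11;
}

/* One decoded three-byte entry of "smemb"; field names are hypothetical. */
struct memb_entry {
    unsigned char rec_index;   /* member record index */
    unsigned char memb_index;  /* membership vector index for this set */
    unsigned char item_index;  /* item vector index of first set item */
};

/*
 * Decode up to max entries from smemb into out and return the count.
 * Scanning stops at a null byte in the first position of an entry.
 */
int decode_smemb(const char *smemb, struct memb_entry *out, int max)
{
    int n = 0;

    while (smemb[0] != '\0' && n < max) {
        out[n].rec_index  = (unsigned char)smemb[0];
        out[n].memb_index = (unsigned char)smemb[1];
        out[n].item_index = (unsigned char)smemb[2];
        smemb += 3;
        n++;
    }
    return n;
}
```

For example, a set whose "sflags" value is KNOWN | PRIOR | 9 is known to the sub-schema, has backward links, and is sorted by member keys with duplicates placed first.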


The "spflags" are set when the sub-schema is validated.
These flags indicate the functions allowed the user for the
set:

    0100    Insert is allowed
     040    Remove is allowed
     020    Find is allowed
APPENDIX G: DIFFERENCES IN THE SCHEMA DDL.


This Appendix gives a detailed description of the
differences between the DDL in the UNIX DBMS and that
described in Ref. 2. The Appendix is organized in parallel
with Section 3 of Ref. 2 and section references below are
sections in Ref. 2 unless otherwise noted. The meta
language used to describe entries is identical to that of
Ref. 2 with the exceptions that no distinction is made
between required or optional words and that options enclosed
in brackets are separated by virgules ("/") in lieu of being
on separate lines.


A. Words. 


The rule in section 3.0.3 for forming words applies to
the DDL. However, when validating a sub-schema, the DBMS
considers upper and lower case letters to be equivalent and
considers underscore ("_") a synonym for hyphen ("-").
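The equivalence rule used during sub-schema validation amounts to comparing words after folding case and mapping underscore to hyphen. A minimal sketch in present-day C follows; the function names "fold" and "same_word" are hypothetical and the code is an illustration, not the DBMS routine.

```c
#include <ctype.h>

/* Fold one character: upper case to lower case, underscore to hyphen. */
static int fold(int c)
{
    if (c == '_')
        return '-';
    return tolower((unsigned char)c);
}

/* Return 1 if two words are equivalent under the rule above, else 0. */
int same_word(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (fold(*a) != fold(*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both words end together */
}
```

Under this rule "PART_NO" and "part-no" name the same object, while "PART" and "PARTS" do not.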


B. Schema Entry (Section 3.1.0).


The "ON [ERROR DURING]" clause is not supported. In the
"PRIVACY LOCK" clause, only the "[FOR COPY]" option is
supported. Specifying alternate privacy locks for the same
function is not supported.
C. Area Entry (Section 3.2.0).


Specifying alternate privacy locks for the same function
is not supported.


D. Record Sub-entry (Section 3.3.0). 


In the "PRIVACY LOCK" clause, specifying alternate
privacy locks is not supported. Specifying a data base
procedure for area selection in the "LOCATION MODE" clause is
not supported.


E. Data Sub-entry (Section 3.3.0). 


In the "TYPE" clause, the only arithmetic types
supported are "BINARY FIXED", "BINARY FLOAT 4" and "BINARY
FLOAT 8". The word "BINARY" is assumed if missing, and "4"
is assumed if neither "4" nor "8" is specified. If the
"TYPE" clause uses the "BIT integer-3" option, "integer-3"
must be a multiple of eight. Since all records must be
fixed length, the "OCCURS data-base-identifier-1 TIMES"
option is not supported. "RESULT" and "SOURCE" items, both
virtual and actual, are not supported. The
"{ENCODING/DECODING}" clause is not supported. Specifying
alternate privacy locks on the same function is not
supported.
F. Set Sub-entry (Section 3.4.0).


The "TEMPORARY" option of the "ORDER" clause is not
supported. The "[INDEXED [NAME IS index-name-1]]" clause is
not supported. Specifying alternate privacy locks for the
same function is not supported.


G. Member Sub-entry (Section 3.4.0).


In the "RANGE KEY" clause, no more than sixteen
data-base-identifiers may be specified. The "DUPLICATES NOT
ALLOWED FOR" clause is not supported. No more than seven
"SEARCH KEY" clauses can be specified. In the "SEARCH KEY"
clause, the "USING" phrase is not meaningful since all
search keys are implemented using indices. In Format 1 of
the "SET SELECTION" clause, the "DATA-BASE-KEY EQUAL TO
data-base-identifier-1" and "CALC-KEY EQUAL TO
data-base-data-name-2 (data-base-data-name-3) ..." options
are not supported. In the same clause, the only form of the
"THEN THRU" phrase supported is without the "EQUAL TO"
option. Specification of alternate privacy keys on the same
function is not supported.
APPENDIX H: CONSTANT FILE CONTENTS.


As mentioned in Section III.t.l, a constant file must 
be created by the DBMS compiler to allow the skeleton pro- 
gram to be transformed into the schema DBM for a particular 
data base. This file dimensions the tables and arrays of 
the schema DBM and initializes arrays for the processing of
data base procedures. The specific tables and arrays are
described below.


A. The Character Buffer.


The character buffer, "scharbuf", is used for storage of
character strings and several other types of variable
length data. This character buffer must be large enough to
contain the schema name; the path names to all schema files
(including temporary ones); the item names of every item in
every record; the primary key, search key and selection
data for every membership vector; the data and validity
check descriptions of every item vector in every record;
and the member record string for every set vector.




B. Data Base Procedure Table. 


This table is composed of two arrays. The first array,
"procname", is an array of character strings and must be
initialized with the name of every data base procedure
mentioned in the schema. The second array, "procpoint", is
an array of function pointers which must be initialized to
point to the data base procedures listed in "procname". The
references to data base procedures in "procpoint" cause the
C compiler to load these functions into the schema DBM.
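The pairing of the two arrays can be sketched as below. The array names "procname" and "procpoint" come from the text; everything else is hypothetical, including the two stand-in procedures "check_age" and "audit_store", the macro "NPROCS" and the lookup function "find_proc". The sketch is written in present-day C and is an illustration of the idea, not the generated constant file.

```c
#include <string.h>

/* Two stand-in data base procedures; hypothetical, for illustration. */
static int check_age(void)   { return 1; }
static int audit_store(void) { return 2; }

/* Name of every data base procedure mentioned in the schema. */
char *procname[] = { "check_age", "audit_store" };

/* Parallel array of function pointers. Referring to the functions
 * here is what pulls them into the finished program. */
int (*procpoint[])(void) = { check_age, audit_store };

#define NPROCS (sizeof procname / sizeof procname[0])

/* Look up a procedure by name; returns a null pointer if not listed. */
int (*find_proc(const char *name))(void)
{
    size_t i;

    for (i = 0; i < NPROCS; i++)
        if (strcmp(procname[i], name) == 0)
            return procpoint[i];
    return 0;
}
```

A caller would use the result as in "find_proc("check_age")()", after checking that the returned pointer is not null.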


C. Privacy Vector Array. 


This array of structures of type "privector" must be
dimensioned large enough to hold the maximum number of item
privacy locks defined in any one record entry. The format
of a "privector" structure is described in Appendix F,
Section C.


D. Record Buffer Array. 


This is a character array which is used to provide
record buffers for the various areas, records and sets. Its
dimension must be the number of areas and sets times the
maximum record size plus the size of each individual record.
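The dimension rule can be written out as a small calculation. The sketch below reads the rule as (number of areas + number of sets) times the maximum record size, plus the sum of the sizes of the individual record types; that reading, the function name "recbuf_dimension" and its parameter names are assumptions made for illustration.

```c
/*
 * Dimension of the record buffer array: one maximum-size buffer per
 * area and per set, plus one buffer of its own size per record type.
 * recsize[] holds the size in bytes of each of the nrecs record types.
 */
long recbuf_dimension(int nareas, int nsets, const int *recsize, int nrecs)
{
    long max = 0;
    long sum = 0;
    int i;

    for (i = 0; i < nrecs; i++) {
        if (recsize[i] > max)
            max = recsize[i];
        sum += recsize[i];
    }
    return (long)(nareas + nsets) * max + sum;
}
```

For a schema with two areas, three sets and record types of 40, 100 and 60 bytes, this gives (2 + 3) x 100 + 200 = 700 bytes.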
E. Area Vector Array.


This array of structures of type "areavect" must be
dimensioned large enough to provide one area vector for each
defined area. An "areavect" structure is described in
Appendix F, Section B.


F. Record Vector Array.


This array of structures of type "recvector" must be
dimensioned large enough to provide one record vector for
each defined record type. A description of the "recvector"
structure is contained in Appendix F, Section D.


G. Member Vector Array. 


This array of structures of type "member" must be
dimensioned large enough to provide one member vector for
every record membership defined in every set. A description
of the "member" structure is contained in Appendix F,
Section D.1.


H. Item Vector Array.


This array of structures of type "itemvect" must be
dimensioned large enough to provide an item vector for every
item in every record type. A description of the "itemvect"
structure is contained in Appendix F, Section D.2.







I. Set Vector Array. 


This array of structures of type "setvect" must be
dimensioned large enough to provide a set vector for every
set defined in the schema. A description of the "setvect"
structure is contained in Appendix F, Section E.